Experian Decision Analytics

 

 

HOW TO REGISTER

Below are the courses curated specifically for Experian's Decision Analytics Teams. Click on the links to review each course syllabus. Next, consult with your Team Lead and request authorization to register for the course of your choice. Your Team Lead will provide you with the Experian promo code. Go back to the course syllabus page and click the orange “Register” button to access the secure registration page. To activate the special tuition rate, enter the Experian promo code in the box marked "Promo Code"; this will apply your discount to the courses listed on this page.

Continue to enter your registration details: name, email address, and credit card. Click “Place Order” to complete your registration. We will email you a registration confirmation and a welcome message and, one week before the course start date, access details for the course itself.

NOTE: Several of these courses require texts; an in-house library is being established in each regional office. You will be able to borrow the text required as needed from this library. You are welcome to purchase your own copy of the texts (some of which are available as e-books). Check the “Requirements” tab on the course syllabus page for a special link to order these books.

At the conclusion of the course, you will be expected to submit a copy of the Record of Completion you earn to your manager. Our assistant teachers in every course will help you stay on track. Work hard, and you will do well.

For questions about this program, and for authorization to take a course, speak with your Team Lead. For questions regarding the scope and arrangement of this program, contact the DA Learning Academy team, which manages the program for Experian, at DALearningAcademy@uk.experian.com. For specific questions about the courses or access, please contact the Registrar for The Institute, Valerie Troiano, at vtroiano@statistics.com.

 

COURSE LISTING

Here are The Institute’s courses selected for your program. They are grouped by subject, and listed in terms of increasing difficulty. Note that courses labeled "1" should be taken before "2", "2" before "3," etc. Be sure to review the specific prerequisites on the syllabus page for each course.

 

INTRODUCTORY

1. Intro Stats for Credit - This course is the equivalent of a semester course in introductory statistics. The course features the use of resampling and simulation techniques (picking numbers from a hat, throwing dice, etc.) that make probabilistic statistics more transparent and understandable. The goal is real understanding, not cookbook learning, and even the most anxious novice (as well as the expert!) will benefit. Note that this course's tuition is a special package price, already discounted, so the promo code will not be valid for this course.

2. R Programming Introduction 1 - This course will provide a basic introduction to R, and its use in organizing and exploring data. The emphasis is on understanding and working with fundamental R data structures, and we will introduce some basic R programming techniques. Once you've completed this course you'll be able to enter, save, retrieve, manipulate, and summarize data using R; you will also have the proper foundation to build your programming skills in R and take advantage of the full power of R.

3. R Programming Introduction 2 - The aim of this course is to teach R Programming to those with some programming knowledge or experience.  It covers simple arithmetic, vector operations, writing functions, the role of user-created packages, logical operations, working with text data, categorical data, time/date data, and data frames. It takes a step-by-step approach, with plenty of examples, to ensure that the basics are effectively mastered.

4. Social Network Analysis - Social Network Analysis has existed for a long time, but social media has fundamentally changed the way we do this analysis. Data has become more plentiful and easy to collect, but this has pushed the boundaries of existing techniques. Sociological methods do not easily scale to the size of these networks, but purely statistical methods miss the complex social interactions that take place. This course will teach a mix of quantitative and qualitative methods for describing, measuring and analyzing social networks. We will learn how to identify influential individuals, track the spread of information through networks, and see how to use these techniques on real problems.

 


DATA MINING

1. Visualization - This course is about the interactive exploration of data, and how it is achieved using state-of-the-art data visualization software.

2. Predictive Analytics 1 - This course will introduce you to the basic concepts in predictive modeling, the most prevalent form of data mining. This course covers the two core paradigms that account for most business applications of predictive modeling: classification and prediction. In both cases, predictive modeling takes data where a variable of interest is known and develops a model that relates this variable to a series of predictor variables. In classification, the variable of interest is categorical ("purchased something" vs. "has not purchased anything"). In prediction, the variable of interest is continuous ("dollars spent"). Five techniques will be used: k-nearest neighbors, classification and regression trees (CART), neural nets, logistic regression and multiple linear regression. The course will also cover the use of partitioning to divide the data into training data (data used to build a model), validation data (data used to assess the performance of different models, or, in some cases, to fine-tune the model) and test data (data used to estimate the performance of the final model). The course includes hands-on work with XLMiner, a data-mining add-in for Excel.
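The partitioning step described above can be sketched in a few lines of Python. This is only an illustration, not course material: the function name `partition`, the 60/20/20 split, and the fixed seed are assumptions chosen for the example (the course itself uses XLMiner).

```python
import random

def partition(records, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle records and split them into training, validation, and test sets.

    The fractions and seed are illustrative assumptions, not values from the course.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # random order so each set is representative
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                   # used to build the model
    valid = shuffled[n_train:n_train + n_valid]  # used to compare or fine-tune models
    test = shuffled[n_train + n_valid:]          # held back to assess the final model
    return train, valid, test

train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))
```

Every record lands in exactly one of the three sets, so the test data plays no part in building or choosing the model.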

3. Big Data Computing with Hadoop [note substantial IT related prerequisites] - This class will introduce statisticians to Hadoop, and provide an exemplar workflow for using Hadoop, writing MapReduce jobs, and finally leveraging Hadoop Streaming to conclude work in an analytics programming language such as R.  In this course you will learn:

-What are Hadoop and the software components of the Hadoop Ecosystem
-How to manage data on a distributed file system
-How to write MapReduce jobs to perform computations with Hadoop
-How to utilize Hadoop Streaming to output jobs

4. Predictive Analytics 3 - Data mining, the art and science of learning from data, covers a number of different procedures. This course covers key unsupervised learning techniques: association rules, principal components analysis, and clustering. (Predictive Analytics 1 & 2 cover techniques that are used to predict a record's class, or the value of an outcome variable on the basis of a set of records with known outcomes). The course will include an integration of supervised and unsupervised learning techniques. This is a hands-on course -- participants in the course will have access to an Excel-based comprehensive tool for data-mining, XLMiner, the use of which will be explained in the course. Participants will apply data mining algorithms to real data, and will interpret the results.

5. Applied Predictive Analytics - The goal of this course is to teach users (who have basic knowledge of R programming, predictive analytics and statistics) to apply machine learning techniques in real-world case studies. This course takes a hands-on approach and offers the opportunity to participate in a private educational competition.

6. Text Mining - This course will introduce the essential techniques of text mining, understood here as the extension of data mining's standard predictive methods to unstructured text. This course will discuss these standard techniques, and will devote considerable attention to the data preparation and handling methods that are required to transform unstructured text into a form in which it can be mined.

7. Cluster Analysis - This course will teach you how to use various cluster analysis methods to identify possible clusters in multivariate data. In marketing applications, clusters of customer records are called market segments (and the process is called market segmentation). Methods discussed include:

-hierarchical clustering (in which smaller clusters are nested inside larger clusters)
-k-means clustering
-two-step clustering
-normal mixture models for continuous variables

8. Sentiment Analysis - This course is designed to give you an introduction to the algorithms, techniques and software used in sentiment analysis. Their use will be illustrated by reference to existing applications, particularly product reviews and opinion mining. The course will try to make clear both the capabilities and the limitations of these applications. For real-world applications, sentiment analysis draws heavily on work in computational linguistics and text mining. At the completion of the course, a student will have a good idea of the field of sentiment analysis, the current state of the art, and the issues and problems that are likely to be the focus of future systems.

9. Natural Language Processing using NLTK - In this course you will use Python and a module called NLTK (the Natural Language Toolkit) to perform natural language processing on medium-sized text corpora. NLTK provides analysts, software developers, researchers, and students with cutting-edge linguistic and machine learning tools that are on par with traditional NLP frameworks. Because it is Python, it comes with "batteries included" and is available on most systems, providing a low barrier to entry for language processing in general and, in particular, allowing you to quickly and easily analyze text data in larger applications.

 

OPERATIONS RESEARCH

1. Risk, Simulation and Queuing - This online course, "Risk Simulation and Queuing," covers three important modeling techniques. Students will learn how to construct and implement simulation models to model (1) the uncertainty in decision input variables (e.g. price, demand, etc.), so that the overall estimate of interest from a model can be supplemented by a risk interval of possible other outcomes (risk simulation), and (2) the variability in arrivals over time (customers, cars at a toll plaza, data packets, etc.) and ensuing queues (queuing theory). Students will also learn how to employ decision trees to incorporate information derived from models to make optimal decisions. Students will use spreadsheet-based software to specify and implement models.

2. Optimization-Linear Programming - Scarcity is a dominant feature of the economic landscape - time, labor and other inputs into business processes. The essence of management is to make choices that make optimal use of scarce resources. Students in this course will learn how to apply linear programming to complex systems to make better decisions - decisions that increase revenue, decrease costs, or improve efficiency of operation. The course introduces the role of mathematical models in decision-making, then covers how to formulate basic linear programming models for decision problems where multiple decisions need to be made in the best possible way, while simultaneously satisfying a number of logical conditions (or constraints). Students will use spreadsheet software to implement and solve these linear programming problems.

3. Financial Risk Modeling - This course will cover the most important principles, techniques and tools in Financial Quantitative Risk Analysis. The course has been developed to effectively combine theoretical sessions with classroom examples and exercises in order to provide students with a comprehensive analysis of Monte Carlo techniques. In addition to discussions of recent innovations in the application of Monte Carlo methods, the course will cover many practical examples, case studies and interactive sessions.

4. Integer and Non-Linear Programming - Many business problems involve flows through a network - transportation, stages of an industrial process, routing of data. Students taking this course will learn to specify and implement optimization models that solve network problems (what is the shortest path through a network, what is the least cost way to route material through a network with multiple supply nodes and multiple demand nodes). Students will also learn how to solve Integer Programming (IP) problems (constrained optimization problems in which one or more decision variables are constrained to be integers: e.g. a firm setting up a wi-fi hotspot could use 2 routers or 3 routers, but not 2.5 routers), and Nonlinear Programming (NLP) problems (where the objective function and constraints are not linear functions of the decision variables). Students will use spreadsheet-based software to specify and implement models.
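The router example can be made concrete with a one-variable sketch. The numbers below (100 users, 40 users per router) and the function name `routers_required` are hypothetical, chosen only to reproduce the "2.5 routers" situation; the course itself uses spreadsheet-based software.

```python
import math

def routers_required(users, users_per_router):
    """Solve: minimize x subject to users_per_router * x >= users.

    Returns the solution when x may be fractional (the relaxed problem)
    and the solution when x must be an integer.
    """
    relaxed = users / users_per_router  # best answer if fractions were allowed
    integer = math.ceil(relaxed)        # smallest whole number that still meets demand
    return relaxed, integer

# Hypothetical figures: 100 users to serve, each router handles 40.
relaxed, integer = routers_required(users=100, users_per_router=40)
print(relaxed, integer)  # prints: 2.5 3
```

With a single variable, rounding up is enough. With several integer variables and constraints, rounding the relaxed solution is not guaranteed to be optimal or even feasible, which is why dedicated IP methods are needed.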

 

BAYESIAN

1. Bayesian Statistics - Learn how to perform Bayesian analysis for a binomial proportion, a normal mean, the difference between normal means, the difference between proportions, and for a simple linear regression model.

2. Introduction to Bayesian Computing and Techniques - Learn why Bayesian computing has gained wide popularity, and how to apply Markov Chain Monte Carlo techniques (MCMC) to Bayesian statistical modeling using WinBUGS software. Participants will learn how to use WinBUGS software, use it to estimate parameters of standard distributions, and implement simple regression models.

3. Bayesian MCMC - Learn how to apply Markov Chain Monte Carlo techniques (MCMC) to Bayesian statistical modeling using WinBUGS and R software. Topics covered include Gibbs sampling and the Metropolis-Hastings method. Participants will also learn how to implement linear regression (normal and t errors), Poisson and loglinear regression, and binary/binomial regression using WinBUGS.

 

CLASSICAL STATISTICS

1. Epidemiologic Statistics - This is an introductory epidemiology course that emphasizes the underlying concepts and methods of epidemiology. Topics covered in the course include: study designs (clinical trials, observational studies, case-control studies, and cross-sectional studies), measures of disease frequency and treatment effect, and controlling for extraneous factors.

2. Design of Experiments - This course will teach you how to use experiments to gain maximum knowledge at minimum cost. For processes of any kind that have measurable inputs and outputs, Design of Experiments (DOE) methods guide you in the optimum selection of inputs for experiments, and in the analysis of results.

3. Forecasting Analytics - This course will teach you how to choose an appropriate time series forecasting method, fit the model, evaluate its performance, and use it for forecasting.

4. Categorical Data Analysis - This course will cover the analysis of contingency table data (tabular data in which the cell entries represent counts of subjects or items falling into certain categories). Topics include tests for independence (comparing proportions as well as chi-square), exact methods, and treatment of ordered data. Both 2-way and 3-way tables are covered. A modeling approach to categorical data analysis will also be presented, which is motivated through special cases of the generalized linear model, specifically Poisson regression for count responses and logistic/probit regression for binomial responses. The focus will be on interpretation of models rather than the theory behind them.

5. Logistic Regression - This course will cover the functional form of the logistic model and how to interpret model coefficients. The concepts of "odds" and "odds ratio" are examined, as well as "risk ratio" and the difference between the two statistics. Our emphasis is on model construction, interpretation, and goodness of fit. Exercises include hands-on computer problems.
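As a rough illustration of how the odds ratio and the risk ratio differ, the sketch below computes both from a 2x2 table. The counts are invented for the example, and the function name `risk_and_odds_ratio` is an assumption, not something taken from the course.

```python
def risk_and_odds_ratio(a, b, c, d):
    """Compute the risk ratio and odds ratio for a 2x2 table.

    a, b: events and non-events in group 1
    c, d: events and non-events in group 2
    """
    risk_1 = a / (a + b)            # probability of the event in group 1
    risk_2 = c / (c + d)            # probability of the event in group 2
    risk_ratio = risk_1 / risk_2
    odds_ratio = (a / b) / (c / d)  # odds = events / non-events
    return risk_ratio, odds_ratio

# Hypothetical counts: 20 of 100 accounts default in group 1, 10 of 100 in group 2.
rr, odds_r = risk_and_odds_ratio(20, 80, 10, 90)
print(round(rr, 2), round(odds_r, 2))  # prints: 2.0 2.25
```

The two statistics are close when the event is rare and drift apart as it becomes more common, which is why an odds ratio from a logistic model should not be read as a risk ratio.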

6. Survival Analysis - This course describes the various methods used for modeling and evaluating survival data, also called time-to-event data. Survival models are used in biostatistical, epidemiological, and a variety of health-related fields. They are also used for research in the physical and biological sciences and in the social sciences, including the analysis of economic, sociological, psychological, political, and anthropological data. Survival analysis has also been applied in engineering, where it is typically referred to as reliability analysis.

7. Bootstrap Methods - This course covers the basic theory and application of the bootstrap family of procedures, with the emphasis on applications. After taking this course, participants will be able to use the bootstrap procedure to assess bias and variance, test hypotheses, and produce confidence intervals. The bootstrap is also illustrated for regression and time series procedures. Basic and improved bootstrap procedures are covered.
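For readers new to the idea, here is a minimal sketch of one basic bootstrap procedure, the percentile confidence interval for a mean. The sample values, the number of resamples, and the function name `bootstrap_ci` are assumptions made for illustration; the course may present the method differently.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, recompute the statistic each time, then sort.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical sample of ten observations.
sample = [12, 15, 9, 20, 14, 11, 18, 13, 16, 10]
lower, upper = bootstrap_ci(sample)
print(lower, upper)
```

The interval comes straight from the spread of the resampled means, with no formula for the standard error required.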
