Helion


Szczegóły ebooka

Data Forecasting and Segmentation Using Microsoft Excel

Data Forecasting and Segmentation Using Microsoft Excel


Data Forecasting and Segmentation Using Microsoft Excel guides you through basic statistics to test whether your data can be used to perform regression predictions and time series forecasts. The exercises covered in this book use real-life data from Kaggle, such as demand for seasonal air tickets and credit card fraud detection.

You'll learn how to apply the grouping K-means algorithm, which helps you find segments of your data that are impossible to see with other analyses, such as business intelligence (BI) and pivot analysis. By analyzing groups returned by K-means, you'll be able to detect outliers that could indicate possible fraud or a bad function in network packets.

By the end of this Microsoft Excel book, you'll be able to use the classification algorithm to group data with different variables. You'll also be able to train linear and time series models to perform predictions and forecasts based on past data.

  • Data Forecasting and Segmentation Using Microsoft Excel
  • Contributors
  • About the author
  • About the reviewer
  • Preface
    • Who this book is for
    • What this book covers
    • To get the most out of this book
    • Download the example code files
    • Download the color images
    • Conventions used
    • Get in touch
    • Share Your Thoughts
  • Part 1 An Introduction to Machine Learning Functions
  • Chapter 1: Understanding Data Segmentation
    • Segmenting data concepts
    • Grouping data in segments of two and three variables
    • Summary
    • Questions
    • Answers
    • Further reading
  • Chapter 2: Applying Linear Regression
    • Understanding the influence of variables in linear regression
    • Projecting values from predictor variables
    • Summary
    • Questions
    • Answers
    • Further reading
  • Chapter 3: What is Time Series?
    • Technical requirements
    • Understanding time series data
    • Designing the time series data model
      • Analyzing the air passenger 10-year data chart
      • Conducting a Durbin-Watson test on our 10-year data
      • Computing the centered moving average of each period lag of the data
      • Analyzing the seasonal irregularity
      • Trending component of the time series
    • Doing the forecast
    • Summary
    • Questions
    • Answers
    • Further reading
  • Part 2 Grouping Data to Find Segments and Outliers
  • Chapter 4: Introduction to Data Grouping
    • Technical requirements
    • Grouping with the K-means machine learning function
    • Finding groups of multiple variables
      • Calculating centroids and the optimal number of segments for one variable
      • Calculating centroids and the optimal number of segments for two or more variables
    • Understanding outliers
    • Summary
    • Questions
    • Answers
    • Further reading
  • Chapter 5: Finding the Optimal Number of Single Variable Groups
    • Technical requirements
    • Finding an optimal number of groups for one variable
      • Instructions to run the required add-in in Excel
      • Running K-means elbow to get the optimal number of groups
      • Passing the data values to the K-means elbow function
      • Executing and interpreting the resulting chart of the optimal number of groups
    • Running the K-means function to get the centroids or group average
    • Finding the groups and centroids of one-variable data with K-means and Excel
      • Assigning values for every group
      • Calculating the centroid or the average point for every group
      • Exploring the range of values for each segment
      • Finding the segments for products and profits
      • Finding the optimal number of groups using the K-means elbow function
      • Running the K-means function to do the group segmentation
    • Summary
    • Questions
    • Answers
    • Further reading
  • Chapter 6: Finding the Optimal Number of Multi-Variable Groups
    • Technical requirements
    • Calculating the optimal number of groups for two and three variables
      • Finding the optimal number of segments for two variables revenue and quantity
      • Using the elbow function to get the number of groups for three variables revenue, quantity, and month of sale
    • Determining the groups and average value (centroids) of two and three variables
      • Getting the groups with the K-means algorithm for two and three variables
      • Visualizing centroids or the average value of each group for two and three variables
      • Charting the product value range of each group for revenue, quantity, and month
    • Using the Elbow and K-means functions with four variables
    • Summary
    • Questions and answers
    • Answers
    • Further reading
  • Chapter 7: Analyzing Outliers for Data Anomalies
    • Technical requirements
    • Representing the data in a 3D chart
      • Kaggle credit card fraud dataset
      • Kaggle suspicious logins
      • Kaggle insurance money amount complaints
    • K-means data grouping
      • Running the elbow algorithm
      • Kaggle credit card fraud dataset
      • Running the K-means function
    • Pivot analysis of the outliers
      • Kaggle credit card fraud dataset
      • Kaggle suspicious logins
      • Kaggle insurance money amount complaints
    • Summary
    • Questions
    • Answers
    • Further reading
  • Part 3 Simple and Multiple Linear Regression Analysis
  • Chapter 8: Finding the Relationship between Variables
    • Technical requirements
    • Charting the predictive models regression variables
      • Plotting the variables to analyze the possible relationship
    • Calculating the linear model confidence percentage
      • Coefficient of determination
      • Correlation coefficient
    • Statistical significance of the slope
    • The regression model's value ranges
    • Summary
    • Questions
    • Answers
    • Further reading
  • Chapter 9: Building, Training, and Validating a Linear Model
    • Technical requirements
    • Calculating the intercept and slope with formulas
    • Computing coefficient significance t-statistics and p-value
      • Coefficient of determination
      • Coefficient of correlation
      • t-statistics and p-value
    • Getting the residual standard error
    • Calculating the r-squared
    • Calculating the f-statistics
    • Training and testing the model
    • Doing prediction scenarios with the regression model
    • Summary
    • Questions
    • Answers
  • Chapter 10: Building, Training, and Validating a Multiple Regression Model
    • Technical requirements
    • Exploring the variables with more influence
    • Calculating t-statistics and p-values
      • Determination coefficient
      • Correlation coefficient
      • t-statistics
      • p-value
    • Calculating residuals standard error and f-statistics
      • Calculating residuals standard errors
      • Calculating f-statistics
    • Training and testing the model
      • Writing a linear regression model formula
      • Building the prediction model
    • Summary
    • Questions
    • Answers
    • Further reading
  • Part 4 Predicting Values with Time Series
  • Chapter 11: Testing Data for Time Series Compliance
    • Technical requirements
    • Visualizing seasonal trends
    • Researching autocorrelation past values' influence over present values
    • Performing the Durbin-Watson autocorrelation test
      • Calculating Durbin-Watson by hand in Excel
    • Summary
    • Questions and answers
    • Answers
    • Further reading
  • Chapter 12: Working with Time Series Using the Centered Moving Average and a Trending Component
    • Technical requirements
    • Calculating the CMA
      • Calculating the moving average and CMA
      • Estimating the season irregular and season components
      • Calculating the trend line
    • Producing the forecast season and trend line
    • Summary
      • Questions
    • Answers
    • Further reading
  • Chapter 13: Training, Validating, and Running the Model
    • Technical requirements
    • Training the model
      • Conducting the Durbin-Watson test
    • Building and training the forecast model
    • Testing the forecast model
    • Doing the forecast
    • Summary
    • Questions
    • Answers
    • Why subscribe?
  • Other Books You May Enjoy
    • Packt is searching for authors like you
    • Share Your Thoughts