Analiza danych

505
Ebook

Practical Business Intelligence. Optimize Business Intelligence for Efficient Data Analysis

Ahmed Sherif

Business Intelligence (BI) is at the crux of revolutionizing enterprise. Everyone wants to minimize losses and maximize profits. Thanks to Big Data and improved methodologies to analyze data, Data Analysts and Data Scientists are increasingly using data to make informed decisions. Just knowing how to analyze data is not enough, you need to start thinking how to use data as a business asset and then perform the right analysis to build an insightful BI solution. Efficient BI strives to achieve the automation of data for ease of reporting and analysis. Through this book, you will develop the ability to think along the right lines and use more than one tool to perform analysis depending on the needs of your business. We start off by preparing you for data analytics. We then move on to teach you a range of techniques to fetch important information from various databases, which can be used to optimize your business.The book aims to provide a full end-to-end solution for an environment setup that can help you make informed business decisions and deliver efficient and automated BI solutions to any company.It is a complete guide for implementing Business intelligence with the help of the most powerful tools like D3.js, R, Tableau, Qlikview and Python that are available on the market.

506
Ebook

Practical Computer Vision. Extract insightful information from images using TensorFlow, Keras, and OpenCV

Abhinav Dadhich

In this book, you will find several recently proposed methods in various domains of computer vision. You will start by setting up the proper Python environment to work on practical applications. This includes setting up libraries such as OpenCV, TensorFlow, and Keras using Anaconda. Using these libraries, you'll start to understand the concepts of image transformation and filtering. You will find a detailed explanation of feature detectors such as FAST and ORB; you'll use them to find similar-looking objects.With an introduction to convolutional neural nets, you will learn how to build a deep neural net using Keras and how to use it to classify the Fashion-MNIST dataset. With regard to object detection, you will learn the implementation of a simple face detector as well as the workings of complex deep-learning-based object detectors such as Faster R-CNN and SSD using TensorFlow. You'll get started with semantic segmentation using FCN models and track objects with Deep SORT. Not only this, you will also use Visual SLAM techniques such as ORB-SLAM on a standard dataset. By the end of this book, you will have a firm understanding of the different computer vision techniques and how to apply them in your applications.

507
Ebook

Practical Data Analysis. For small businesses, analyzing the information contained in their data using open source technology could be game-changing. All you need is some basic programming and mathematical skills to do just that

Hector Cuesta

Plenty of small businesses face big amounts of data but lack the internal skills to support quantitative analysis. Understanding how to harness the power of data analysis using the latest open source technology can lead them to providing better customer service, the visualization of customer needs, or even the ability to obtain fresh insights about the performance of previous products. Practical Data Analysis is a book ideal for home and small business users who want to slice and dice the data they have on hand with minimum hassle.Practical Data Analysis is a hands-on guide to understanding the nature of your data and turn it into insight. It will introduce you to the use of machine learning techniques, social networks analytics, and econometrics to help your clients get insights about the pool of data they have at hand. Performing data preparation and processing over several kinds of data such as text, images, graphs, documents, and time series will also be covered.Practical Data Analysis presents a detailed exploration of the current work in data analysis through self-contained projects. First you will explore the basics of data preparation and transformation through OpenRefine. Then you will get started with exploratory data analysis using the D3js visualization framework. You will also be introduced to some of the machine learning techniques such as, classification, regression, and clusterization through practical projects such as spam classification, predicting gold prices, and finding clusters in your Facebook friends' network. You will learn how to solve problems in text classification, simulation, time series forecast, social media, and MapReduce through detailed projects. Finally you will work with large amounts of Twitter data using MapReduce to perform a sentiment analysis implemented in Python and MongoDB. Practical Data Analysis contains a combination of carefully selected algorithms and data scrubbing that enables you to turn your data into insight.

508
Ebook

Practical Data Analysis. Pandas, MongoDB, Apache Spark, and more - Second Edition

Hector Cuesta, Dr. Sampath Kumar

Beyond buzzwords like Big Data or Data Science, there are a great opportunities to innovate in many businesses using data analysis to get data-driven products. Data analysis involves asking many questions about data in order to discover insights and generate value for a product or a service.This book explains the basic data algorithms without the theoretical jargon, and you’ll get hands-on turning data into insights using machine learning techniques. We will perform data-driven innovation processing for several types of data such as text, Images, social network graphs, documents, and time series, showing you how to implement large data processing with MongoDB and Apache Spark.

509
Ebook

Practical Data Analysis Using Jupyter Notebook. Learn how to speak the language of data by extracting useful and actionable insights using Python

Marc Wintjen, Andrew Vlahutin

Data literacy is the ability to read, analyze, work with, and argue using data. Data analysis is the process of cleaning and modeling your data to discover useful information. This book combines these two concepts by sharing proven techniques and hands-on examples so that you can learn how to communicate effectively using data.After introducing you to the basics of data analysis using Jupyter Notebook and Python, the book will take you through the fundamentals of data. Packed with practical examples, this guide will teach you how to clean, wrangle, analyze, and visualize data to gain useful insights, and you'll discover how to answer questions using data with easy-to-follow steps.Later chapters teach you about storytelling with data using charts, such as histograms and scatter plots. As you advance, you'll understand how to work with unstructured data using natural language processing (NLP) techniques to perform sentiment analysis. All the knowledge you gain will help you discover key patterns and trends in data using real-world examples. In addition to this, you will learn how to handle data of varying complexity to perform efficient data analysis using modern Python libraries.By the end of this book, you'll have gained the practical skills you need to analyze data with confidence.

510
Ebook

Practical Data Quality. Learn practical, real-world strategies to transform the quality of data in your organization

Robert Hawker, Nicola Askham

Poor data quality can lead to increased costs, hinder revenue growth, compromise decision-making, and introduce risk into organizations. This leads to employees, customers, and suppliers finding every interaction with the organization frustrating.Practical Data Quality provides a comprehensive view of managing data quality within your organization, covering everything from business cases through to embedding improvements that you make to the organization permanently. Each chapter explains a key element of data quality management, from linking strategy and data together to profiling and designing business rules which reveal bad data. The book outlines a suite of tried-and-tested reports that highlight bad data and allow you to develop a plan to make corrections. Throughout the book, you’ll work with real-world examples and utilize re-usable templates to accelerate your initiatives.By the end of this book, you’ll have gained a clear understanding of every stage of a data quality initiative and be able to drive tangible results for your organization at pace.

511
Ebook

Practical Data Science Cookbook. Data pre-processing, analysis and visualization using R and Python - Second Edition

Prabhanjan Narayanachar Tattar, Bhushan Purushottam Joshi, Sean Patrick Murphy, ABHIJIT DASGUPTA, ...

As increasing amounts of data are generated each year, the need to analyze and create value out of it is more important than ever. Companies that know what to do with their data and how to do it well will have a competitive advantage over companies that don’t. Because of this, there will be an increasing demand for people that possess both the analytical and technical abilities to extract valuable insights from data and create valuable solutions that put those insights to use. Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline, and guides you through several data projects in a step-by-step format. By sequentially working through the steps in each chapter, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations with examples using the two most popular programming languages for data analysis—R and Python.

512
Ebook

Practical Data Wrangling. Expert techniques for transforming your raw data into a valuable source for analytics

Allan Visochek

Around 80% of time in data analysis is spent on cleaning and preparing data for analysis. This is, however, an important task, and is a prerequisite to the rest of the data analysis workflow, including visualization, analysis and reporting. Python and R are considered a popular choice of tool for data analysis, and have packages that can be best used to manipulate different kinds of data, as per your requirements. This book will show you the different data wrangling techniques, and how you can leverage the power of Python and R packages to implement them.You’ll start by understanding the data wrangling process and get a solid foundation to work with different types of data. You’ll work with different data structures and acquire and parse data from various locations. You’ll also see how to reshape the layout of data and manipulate, summarize, and join data sets. Finally, we conclude with a quick primer on accessing and processing data from databases, conducting data exploration, and storing and retrieving data quickly using databases.The book includes practical examples on each of these points using simple and real-world data sets to give you an easier understanding. By the end of the book, you’ll have a thorough understanding of all the data wrangling concepts and how to implement them in the best possible way.

513
Ebook

Practical Deep Learning at Scale with MLflow. Bridge the gap between offline experimentation and online production

Yong Liu, Dr. Matei Zaharia

The book starts with an overview of the deep learning (DL) life cycle and the emerging Machine Learning Ops (MLOps) field, providing a clear picture of the four pillars of deep learning: data, model, code, and explainability and the role of MLflow in these areas.From there onward, it guides you step by step in understanding the concept of MLflow experiments and usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. You’ll also tackle running DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tuning DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. As you progress, you’ll learn how to build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline for production using Ray Serve and AWS SageMaker, and finally create a DL explanation as a service (EaaS) using the popular Shapley Additive Explanations (SHAP) toolbox.By the end of this book, you’ll have built the foundation and gained the hands-on experience you need to develop a DL pipeline solution from initial offline experimentation to final deployment and production, all within a reproducible and open source framework.

514
Ebook

Practical GIS. Learn novice to advanced topics such as QGIS, Spatial data analysis, and more

Gábor Farkas

The most commonly used GIS tools automate tasks that were historically done manually—compiling new maps by overlaying one on top of the other or physically cutting maps into pieces representing specific study areas, changing their projection, and getting meaningful results from the various layers by applying mathematical functions and operations. This book is an easy-to-follow guide to use the most matured open source GIS tools for these tasks.We’ll start by setting up the environment for the tools we use in the book. Then you will learn how to work with QGIS in order to generate useful spatial data. You will get to know the basics of queries, data management, and geoprocessing.After that, you will start to practice your knowledge on real-world examples. We will solve various types of geospatial analyses with various methods. We will start with basic GIS problems by imitating the work of an enthusiastic real estate agent, and continue with more advanced, but typical tasks by solving a decision problem. Finally, you will find out how to publish your data (and results) on the web. We will publish our data with QGIS Server and GeoServer, and create a basic web map with the API of the lightweight Leaflet web mapping library.

515
Ebook

Practical Machine Learning Cookbook. Supervised and unsupervised machine learning simplified

Atul Tripathi

Machine learning has become the new black. The challenge in today’s world is the explosion of data from existing legacy data and incoming new structured and unstructured data. The complexity of discovering, understanding, performing analysis, and predicting outcomes on the data using machine learning algorithms is a challenge. This cookbook will help solve everyday challenges you face as a data scientist. The application of various data science techniques and on multiple data sets based on real-world challenges you face will help you appreciate a variety of techniques used in various situations.The first half of the book provides recipes on fairly complex machine-learning systems, where you’ll learn to explore new areas of applications of machine learning and improve its efficiency. That includes recipes on classifications, neural networks, unsupervised and supervised learning, deep learning, reinforcement learning, and more.The second half of the book focuses on three different machine learning case studies, all based on real-world data, and offers solutions and solves specific machine-learning issues in each one.

516
Ebook

Practical Predictive Analytics. Analyse current and historical data to predict future trends using R, Spark, and more

Ralph Winters

This is the go-to book for anyone interested in the steps needed to develop predictive analytics solutions with examples from the world of marketing, healthcare, and retail. We'll get startedwith a brief history of predictive analytics and learn about different roles and functions people play within a predictive analytics project. Then, we will learn about various ways of installing R along with their pros and cons, combined with a step-by-step installation of RStudio,and a description of the best practices for organizing your projects.On completing the installation, we will begin to acquire the skills necessary to input, clean, and prepare your data for modeling. We will learn the six specific steps needed to implement andsuccessfully deploy a predictive model starting from asking the right questions through model development and ending with deploying your predictive model into production. We will learn whycollaboration is important and how agile iterative modeling cycles can increase your chances of developing and deploying the best successful model.We will continue your journey in the cloud by extending your skill set by learning about Databricks and SparkR, which allow you to develop predictive models on vast gigabytes of data.

517
Ebook

Practical Real-time Data Processing and Analytics. Distributed Computing and Event Processing using Apache Spark, Flink, Storm, and Kafka

Shilpi Saxena, Saurabh Gupta

With the rise of Big Data, there is an increasing need to process large amounts of data continuously, with a shorter turnaround time. Real-time data processing involves continuous input, processing and output of data, with the condition that the time required for processing is as short as possible.This book covers the majority of the existing and evolving open source technology stack for real-time processing and analytics. You will get to know about all the real-time solution aspects, from the source to the presentation to persistence. Through this practical book, you’ll be equipped with a clear understanding of how to solve challenges on your own.We’ll cover topics such as how to set up components, basic executions, integrations, advanced use cases, alerts, and monitoring. You’ll be exposed to the popular tools used in real-time processing today such as Apache Spark, Apache Flink, and Storm. Finally, you will put your knowledge to practical use by implementing all of the techniques in the form of a practical, real-world use case.By the end of this book, you will have a solid understanding of all the aspects of real-time data processing and analytics, and will know how to deploy the solutions in production environments in the best possible manner.

518
Ebook

Practical Site Reliability Engineering. Automate the process of designing, developing, and delivering highly reliable apps and services with SRE

Pethuru Raj Chelliah, Shreyash Naithani, Shailender Singh

Site reliability engineering (SRE) is being touted as the most competent paradigm in establishing and ensuring next-generation high-quality software solutions.This book starts by introducing you to the SRE paradigm and covers the need for highly reliable IT platforms and infrastructures. As you make your way through the next set of chapters, you will learn to develop microservices using Spring Boot and make use of RESTful frameworks. You will also learn about GitHub for deployment, containerization, and Docker containers. Practical Site Reliability Engineering teaches you to set up and sustain containerized cloud environments, and also covers architectural and design patterns and reliability implementation techniques such as reactive programming, and languages such as Ballerina and Rust. In the concluding chapters, you will get well-versed with service mesh solutions such as Istio and Linkerd, and understand service resilience test practices, API gateways, and edge/fog computing.By the end of this book, you will have gained experience on working with SRE concepts and be able to deliver highly reliable apps and services.

519
Ebook

Practical Threat Intelligence and Data-Driven Threat Hunting. A hands-on guide to threat hunting with the ATT&CK™ Framework and open source tools

Valentina Costa-Gazcón

Threat hunting (TH) provides cybersecurity analysts and enterprises with the opportunity to proactively defend themselves by getting ahead of threats before they can cause major damage to their business.This book is not only an introduction for those who don’t know much about the cyber threat intelligence (CTI) and TH world, but also a guide for those with more advanced knowledge of other cybersecurity fields who are looking to implement a TH program from scratch.You will start by exploring what threat intelligence is and how it can be used to detect and prevent cyber threats. As you progress, you’ll learn how to collect data, along with understanding it by developing data models. The book will also show you how to set up an environment for TH using open source tools. Later, you will focus on how to plan a hunt with practical examples, before going on to explore the MITRE ATT&CK framework.By the end of this book, you’ll have the skills you need to be able to carry out effective hunts in your own environment.

520
Ebook

Practical Time Series Analysis. Master Time Series Data Processing, Visualization, and Modeling using Python

Avishek Pal, PKS Prakash

Time Series Analysis allows us to analyze data which is generated over a period of time and has sequential interdependencies between the observations. This book describes special mathematical tricks and techniques which are geared towards exploring the internal structures of time series data and generating powerful descriptive and predictive insights. Also, the book is full of real-life examples of time series and their analyses using cutting-edge solutions developed in Python. The book starts with descriptive analysis to create insightful visualizations of internal structures such as trend, seasonality, and autocorrelation. Next, the statistical methods of dealing with autocorrelation and non-stationary time series are described. This is followed by exponential smoothing to produce meaningful insights from noisy time series data. At this point, we shift focus towards predictive analysis and introduce autoregressive models such as ARMA and ARIMA for time series forecasting. Later, powerful deep learning methods are presented, to develop accurate forecasting models for complex time series, and under the availability of little domain knowledge. All the topics are illustrated with real-life problem scenarios and their solutions by best-practice implementations in Python.The book concludes with the Appendix, with a brief discussion of programming and solving data science problems using Python.

521
Ebook

Principles of Data Science. A beginner's guide to essential math and coding skills for data fluency and machine learning - Third Edition

Sinan Ozdemir

Principles of Data Science bridges mathematics, programming, and business analysis, empowering you to confidently pose and address complex data questions and construct effective machine learning pipelines. This book will equip you with the tools to transform abstract concepts and raw statistics into actionable insights.Starting with cleaning and preparation, you’ll explore effective data mining strategies and techniques before moving on to building a holistic picture of how every piece of the data science puzzle fits together. Throughout the book, you’ll discover statistical models with which you can control and navigate even the densest or the sparsest of datasets and learn how to create powerful visualizations that communicate the stories hidden in your data.With a focus on application, this edition covers advanced transfer learning and pre-trained models for NLP and vision tasks. You’ll get to grips with advanced techniques for mitigating algorithmic bias in data as well as models and addressing model and data drift. Finally, you’ll explore medium-level data governance, including data provenance, privacy, and deletion request handling.By the end of this data science book, you'll have learned the fundamentals of computational mathematics and statistics, all while navigating the intricacies of modern ML and large pre-trained models like GPT and BERT.

522
Ebook

Principles of Data Science. Mathematical techniques and theory to succeed in data-driven industries

Sinan Ozdemir

Need to turn your skills at programming into effective data science skills? Principles of Data Science is created to help you join the dots between mathematics, programming, and business analysis. With this book, you’ll feel confident about asking—and answering—complex and sophisticated questions of your data to move from abstract and raw statistics to actionable ideas.With a unique approach that bridges the gap between mathematics and computer science, this books takes you through the entire data science pipeline. Beginning with cleaning and preparing data, and effective data mining strategies and techniques, you’ll move on to build a comprehensive picture of how every piece of the data science puzzle fits together. Learn the fundamentals of computational mathematics and statistics, as well as some pseudocode being used today by data scientists and analysts. You’ll get to grips with machine learning, discover the statistical models that help you take control and navigate even the densest datasets, and find out how to create powerful visualizations that communicate what your data means.

523
Ebook

Principles of Data Science. Understand, analyze, and predict data using Machine Learning concepts and tools - Second Edition

Sinan Ozdemir, Sunil Kakade, Marco Tibaldeschi

Need to turn programming skills into effective data science skills? This book helps you connect mathematics, programming, and business analysis. You’ll feel confident asking—and answering—complex, sophisticated questions of your data, making abstract and raw statistics into actionable ideas.Going through the data science pipeline, you'll clean and prepare data and learn effective data mining strategies and techniques to gain a comprehensive view of how the data science puzzle fits together. You’ll learn fundamentals of computational mathematics and statistics and pseudo-code used by data scientists and analysts. You’ll learn machine learning, discovering statistical models that help control and navigate even the densest datasets, and learn powerful visualizations that communicate what your data means.

524
Ebook

Puppet 3 Cookbook. An essential book if you have responsibility for servers. Real-world examples and code will give you Puppet expertise, allowing more control over servers, cloud computing, and desktops. A time-saving, career-enhancing tutorial - Second Edition

John Arundel

A revolution is happening in web operations. Configuration management tools can build servers in seconds, and automate your entire network. Tools like Puppet are essential to taking full advantage of the power of cloud computing, and building reliable, scalable, secure, high-performance systems. More and more systems administration and IT jobs require some knowledge of configuration management, and specifically Puppet.Puppet 3 Cookbook takes you beyond the basics to explore the full power of Puppet, showing you in detail how to tackle a variety of real-world problems and applications. At every step it shows you exactly what commands you need to type, and includes full code samples for every recipe.The book takes the reader from a basic knowledge of Puppet to a complete and expert understanding of Puppet's latest and most advanced features, community best practices, writing great manifests, scaling and performance, and extending Puppet by adding your own providers and resources. It starts with guidance on how to set up and expand your Puppet infrastructure, then progresses through detailed information on the language and features, external tools, reporting, monitoring, and troubleshooting, and concludes with many specific recipes for managing popular applications.The book includes real examples from production systems and techniques that are in use in some of the world's largest Puppet installations, including a distributed Puppet architecture based on the Git version control system. You'll be introduced to powerful tools that work with Puppet such as Hiera. The book also explains managing Ruby applications and MySQL databases, building web servers, load balancers, high-availability systems with Heartbeat, and many other state-of-the-art techniques

525
Ebook

PySpark Cookbook. Over 60 recipes for implementing big data processing and analytics using Apache Spark and Python

Denny Lee, Tomasz Drabas

Apache Spark is an open source framework for efficient cluster computing with a strong interface for data parallelism and fault tolerance. The PySpark Cookbook presents effective and time-saving recipes for leveraging the power of Python and putting it to use in the Spark ecosystem.You’ll start by learning the Apache Spark architecture and how to set up a Python environment for Spark. You’ll then get familiar with the modules available in PySpark and start using them effortlessly. In addition to this, you’ll discover how to abstract data with RDDs and DataFrames, and understand the streaming capabilities of PySpark. You’ll then move on to using ML and MLlib in order to solve any problems related to the machine learning capabilities of PySpark and use GraphFrames to solve graph-processing problems. Finally, you will explore how to deploy your applications to the cloud using the spark-submit command.By the end of this book, you will be able to use the Python API for Apache Spark to solve any problems associated with building data-intensive applications.

526
Ebook

Python 3 and Data Analytics Pocket Primer. A Quick Guide to NumPy, Pandas, and Data Visualization

Mercury Learning and Information, Oswald Campesato

This book, part of the best-selling Pocket Primer series, introduces readers to the fundamental concepts of data analytics using Python 3. The course begins with a concise introduction to Python, covering essential programming constructs and data manipulation techniques. This foundation sets the stage for deeper dives into data analytics, emphasizing the importance of data cleaning, a critical step in any data analysis process.Following the Python basics, the course explores powerful libraries such as NumPy and Pandas for efficient data handling and manipulation. It then delves into statistical concepts, providing the necessary background for understanding data distributions and analytical methods. The course culminates in data visualization techniques using Matplotlib and Seaborn, demonstrating how to effectively communicate insights through graphical representations.Throughout the course, numerous code samples and practical examples are provided, reinforcing learning and offering hands-on experience. Companion files with source code and figures are available online, supporting the learning journey. This comprehensive guide equips both beginners and seasoned professionals with the skills needed to excel in data analytics.

527
Ebook

Python 3 and Data Visualization. Mastering Graphics and Data Manipulation with Python

Mercury Learning and Information, Oswald Campesato

Python 3 and Data Visualization provides an in-depth exploration of Python 3 programming and data visualization techniques. The course begins with an introduction to Python, covering essential topics from basic data types and loops to advanced constructs such as dictionaries and matrices. This foundation prepares readers for the next section, which focuses on NumPy and its powerful array operations, seamlessly leading into data visualization using prominent libraries like Matplotlib.Chapter 6 delves into Seaborn's rich visualization tools, providing insights into datasets like Iris and Titanic. The appendix covers additional visualization tools and techniques, including SVG graphics and D3 for dynamic visualizations. The companion files include numerous Python code samples and figures, enhancing the learning experience.From foundational Python concepts to advanced data visualization techniques, this course serves as a comprehensive resource for both beginners and seasoned professionals, equipping them with the necessary skills to effectively visualize data.

528
Ebook

Python 3 and Machine Learning Using ChatGPT / GPT-4. Harness the Power of Python, Machine Learning, and Generative AI

Mercury Learning and Information, Oswald Campesato

This book bridges the gap between theoretical knowledge and practical application in Python programming, machine learning, and using ChatGPT-4 in data science. It starts with an introduction to Pandas for data manipulation and analysis. The book then explores various machine learning classifiers, from kNN to SVMs. Later chapters cover GPT-4's capabilities, enhancing linear regression analysis, and using ChatGPT in data visualization, including AI apps, GANs, and DALL-E.The journey begins with mastering Pandas and machine learning fundamentals. It progresses to applying GPT-4 in linear regression and machine learning classifiers. The final chapters focus on using ChatGPT for data visualization, making complex results accessible and understandable.Understanding these concepts is crucial for modern data scientists. This book transitions readers from basic Python programming to advanced applications of ChatGPT-4 in data science. Companion files with source code, datasets, and figures enhance learning, making this an essential resource for mastering Python, machine learning, and AI-driven data visualization.