Big data

121
Ebook

Jupyter for Data Science. Exploratory analysis, statistical modeling, machine learning, and data visualization with Jupyter

Dan Toomey

Jupyter Notebook is a web-based environment that enables interactive computing in notebook documents. It allows you to create documents that contain live code, equations, and visualizations. This book is a comprehensive guide to getting started with data science using the popular Jupyter Notebook. If you are familiar with Jupyter Notebook and want to learn how to use its capabilities to perform various data science tasks, this is the book for you! From data exploration to visualization, this book will take you through every step of implementing an effective data science pipeline using Jupyter. You will also see how you can use Jupyter's features to share your documents and code with your colleagues. The book also explains how Python 3, R, and Julia can be integrated with Jupyter for various data science tasks. By the end of this book, you will be able to comfortably leverage the power of Jupyter to perform a range of data science tasks.

122
Ebook

Kibana 7 Quick Start Guide. Visualize your Elasticsearch data with ease

Anurag Srivastava

The Elastic Stack is growing rapidly and, day by day, additional tools are being added to make it more effective. This book endeavors to explain all the important aspects of Kibana, which is essential for utilizing its full potential. This book covers the core concepts of Kibana, with chapters set out in a coherent manner so that readers can advance their learning step by step. The focus is on a practical approach, enabling the reader to apply the examples in real time for a better understanding of the concepts and to gain the right skills with the tool. With its succinct explanations, the book is easy to use as a reference guide for learning basic to advanced implementations of Kibana. The practical examples, such as the creation of Kibana dashboards from CSV data, application RDBMS data, system metrics data, log file data, APM agents, and search results, provide readers with a number of different starting points from which they can fetch any type of data into Kibana for analysis or dashboarding.

123
Ebook

Learn Bitcoin and Blockchain. Understanding blockchain and Bitcoin architecture to build decentralized applications

Kirankalyan Kulkarni

Blockchain is a distributed database that enables permanent, transparent, and secure storage of data. Blockchain technology uses cryptography to keep data secure. Learn Bitcoin and Blockchain is the perfect entry point to the world of decentralized databases. This book will take you on a journey through the blockchain database, followed by advanced implementations of the blockchain concept. You will learn about Bitcoin basics and their technical operations. As you make your way through the book, you will gain insight into this leading technology and its implementation in the real world. You will also cover the technical foundation of blockchain and understand the fundamentals of cryptography and how they keep data secure. In the concluding chapters, you’ll get to grips with the mechanisms behind cryptocurrencies. By the end of this book, you will have learned about decentralized digital money, advanced blockchain concepts, and Bitcoin and blockchain security.

124
Ebook

Learn Quantum Computing with Python and IBM Quantum Experience. A hands-on introduction to quantum computing and writing your own quantum programs with Python

Robert Loredo

IBM Quantum Experience is a platform that enables developers to learn the basics of quantum computing by allowing them to run experiments on a quantum computing simulator and a real quantum computer. This book will explain the basic principles of quantum mechanics, the principles involved in quantum computing, and the implementation of quantum algorithms and experiments on IBM's quantum processors. You will start working with simple programs that illustrate quantum computing principles and slowly work your way up to more complex programs and algorithms that leverage quantum computing. As you build on your knowledge, you’ll understand the functionality of IBM Quantum Experience and the various resources it offers. Furthermore, you’ll learn the differences not only between the various quantum computers but also between the various simulators available. Later, you’ll explore the basics of quantum computing, quantum volume, and a few basic algorithms, all while making optimal use of the resources available on IBM Quantum Experience. By the end of this book, you'll have learned how to build quantum programs on your own and gained practical quantum computing skills that you can apply to your business.

125
Ebook

Learning Alteryx. A beginner's guide to using Alteryx for self-service analytics and business intelligence

Renato Baruti

Alteryx, as a leading data blending and advanced data analytics platform, has taken self-service data analytics to the next level. Companies worldwide often find themselves struggling to prepare and blend massive datasets, a time-consuming task for analysts. Alteryx solves these problems with a repeatable workflow designed to quickly clean, prepare, blend, and join your data in a seamless manner. This book will set you on a self-service data analytics journey that will help you create efficient workflows using Alteryx, without any coding involved. It will empower you and your organization to make well-informed decisions with the help of deeper business insights from the data. Starting with the fundamentals of using Alteryx, such as data preparation and blending, you will delve into more advanced concepts such as performing predictive analytics. You will also learn how to use Alteryx’s features to share the insights gained with the relevant decision makers. To ensure consistency, we will be using data from the healthcare domain throughout this book. The knowledge you gain from this book will guide you to confidently solve real-life problems related to business intelligence. Whether you are a novice with Alteryx or an experienced data analyst keen to explore Alteryx’s self-service analytics features, this book will be the perfect companion for you.

126
Ebook

Learning Elastic Stack 6.0. A beginner’s guide to distributed search, analytics, and visualization using Elasticsearch, Logstash and Kibana

Pranav Shukla, Sharath Kumar M N

The Elastic Stack is a powerful combination of tools for distributed search, analytics, logging, and visualization of data from medium to massive data sets. The newly released Elastic Stack 6.0 brings new features and capabilities that empower users to find unique, actionable insights through these techniques. This book will give you a fundamental understanding of what the stack is all about, and how to use it efficiently to build powerful real-time data processing applications. After a quick overview of the newly introduced features in Elastic Stack 6.0, you’ll learn how to set up the stack by installing the tools, and see their basic configurations. Then you’ll see how to use Elasticsearch for distributed searching and analytics, along with Logstash for logging, and Kibana for data visualization. The book also demonstrates the creation of custom plugins using Kibana and Beats. You’ll find out about Elastic X-Pack, a useful extension for effective security and monitoring. We also provide useful tips on how to use the Elastic Cloud and deploy the Elastic Stack in production environments. On completing this book, you’ll have a solid foundational knowledge of the basic Elastic Stack functionalities. You’ll also have a good understanding of the role of each component in the stack to solve different data processing problems.

127
Ebook

Learning Google BigQuery. A beginner's guide to mining massive datasets through interactive analysis

Thirukkumaran Haridass, Eric Brown

Google BigQuery is a popular cloud data warehouse for large-scale data analytics. This book will serve as a comprehensive guide to mastering BigQuery, and how you can utilize it to quickly and efficiently get useful insights from your Big Data. You will begin with a quick overview of the Google Cloud Platform and the various services it supports. Then, you will be introduced to the Google BigQuery API and how it fits within the framework of GCP. The book covers useful techniques to migrate your existing data from your enterprise to Google BigQuery, as well as readying and optimizing it for analysis. You will perform basic as well as advanced data querying using BigQuery, and connect the results to various third-party tools, such as R and Tableau, for reporting and visualization purposes. If you're looking to implement real-time reporting of the streaming data running in your enterprise, this book will also help you. This book also provides tips, best practices, and mistakes to avoid while working with Google BigQuery and services that interact with it. By the time you're done with it, you will have set a solid foundation in working with BigQuery to solve even the trickiest of data problems.

128
Ebook

Learning Informatica PowerCenter 10.x. Enterprise data warehousing and intelligent data centers for efficient data management solutions - Second Edition

Rahul Malewar

Informatica PowerCenter is an industry-leading ETL tool, known for its accelerated data extraction, transformation, and data management strategies. This book will be your quick guide to exploring Informatica PowerCenter’s powerful features such as working on sources, targets, transformations, performance optimization, scheduling, deploying for processing, and managing your data at speed. First, you’ll learn how to install and configure the tools. You will learn to implement various data warehouse and ETL concepts, and use PowerCenter 10.x components to build mappings, tasks, workflows, and so on. You will come across features such as transformations, SCD, XML processing, partitioning, constraint-based loading, incremental aggregation, and many more. Moreover, you’ll learn to deliver powerful visualizations for data profiling using the advanced monitoring dashboard functionality offered by the new version. Using data transformation techniques, performance tuning, and the many new advanced features, this book will help you understand and process data for training or production purposes. The step-by-step approach and adoption of real-time scenarios will guide you through effectively accessing all core functionalities offered by Informatica PowerCenter version 10.x.