Big data

609
E-BOOK

Mastering Apache Spark 2.x. Advanced techniques in complex Big Data processing, streaming analytics and machine learning - Second Edition

Romeo Kienzler

Apache Spark is an in-memory, cluster-based Big Data processing system that provides a wide range of functionalities such as graph processing, machine learning, stream processing, and more. This book will take your knowledge of Apache Spark to the next level by teaching you how to expand Spark’s functionality and build your data flows and machine/deep learning programs on top of the platform. The book starts with a quick overview of the Apache Spark ecosystem, and introduces you to the new features and capabilities in Apache Spark 2.x. You will then work with the different modules in Apache Spark such as interactive querying with Spark SQL, using DataFrames and Datasets effectively, streaming analytics with Spark Streaming, and performing machine learning and deep learning on Spark using MLlib and external tools such as H2O and Deeplearning4j. The book also contains chapters on efficient graph processing, memory management and using Apache Spark on the cloud. By the end of this book, you will have all the necessary information to master Apache Spark, and use it efficiently for Big Data processing and analytics.

610
E-BOOK

Mastering Apache Storm. Real-time big data streaming using Kafka, Hbase and Redis

Ankit Jain

Apache Storm is a real-time Big Data processing framework that processes large amounts of data reliably, guaranteeing that every message will be processed. Storm allows you to scale with your data as it grows, making it an excellent platform to solve your big data problems. This extensive guide will help you understand Storm, from the basics to advanced topics. The book begins with a detailed introduction to real-time processing and where Storm fits in to solve these problems. You’ll get an understanding of deploying Storm on clusters by writing a basic Storm Hello World example. Next we’ll introduce you to Trident and you’ll get a clear understanding of how you can develop and deploy a Trident topology. We cover topics such as monitoring, Storm parallelism, the scheduler, and log processing in an easy-to-understand manner. You will also learn how to integrate Storm with other well-known Big Data technologies such as HBase, Redis, Kafka, and Hadoop to realize the full potential of Storm. With real-world examples and clear explanations, this book will give you a thorough mastery of Apache Storm. You will be able to use this knowledge to develop efficient, distributed real-time applications to cater to your business needs.

611
E-BOOK

Mastering Arduino. A project-based approach to electronics, circuits, and programming

Jon Hoffman

Mastering Arduino is an all-in-one guide to getting the most out of your Arduino. This practical, no-nonsense guide teaches you all of the electronics and programming skills that you need to create advanced Arduino projects. This book is packed full of real-world projects for you to practice on, bringing all of the knowledge in the book together and giving you the skills to build your own robot from the examples in this book. The final two chapters discuss wireless technologies and how they can be used in your projects. The book begins with the basics of electronics, making sure that you understand components, circuits, and prototyping before moving on. It then performs the same function for code, getting you into the Arduino IDE and showing you how to connect the Arduino to a computer and run simple projects on your Arduino. Once the basics are out of the way, the next 10 chapters of the book focus on small projects centered around particular components, such as LCD displays, stepper motors, or voice synthesizers. Each of these chapters will get you familiar with the technology involved, how to build with it, how to program it, and how it can be used in your own projects.

612
E-BOOK

Mastering Azure Machine Learning. Execute large-scale end-to-end machine learning with Azure - Second Edition

Christoph Körner, Marcel Alsdorf

Azure Machine Learning is a cloud service for accelerating and managing the machine learning (ML) project life cycle that ML professionals, data scientists, and engineers can use in their day-to-day workflows. This book covers the end-to-end ML process using Microsoft Azure Machine Learning, including data preparation, performing and logging ML training runs, designing training and deployment pipelines, and managing these pipelines via MLOps. The first section shows you how to set up an Azure Machine Learning workspace; ingest and version datasets; and preprocess, label, and enrich these datasets for training. In the next two sections, you'll discover how to enrich and train ML models for embedding, classification, and regression. You'll explore advanced NLP techniques, traditional ML models such as boosted trees, modern deep neural networks, recommendation systems, reinforcement learning, and complex distributed ML training techniques, all using Azure Machine Learning. The last section will teach you how to deploy the trained models as a batch pipeline or real-time scoring service using Docker, Azure Machine Learning clusters, Azure Kubernetes Service, and alternative deployment targets. By the end of this book, you’ll be able to combine all the steps you’ve learned by building an MLOps pipeline.

613
E-BOOK

Mastering Azure Machine Learning. Perform large-scale end-to-end advanced machine learning in the cloud with Microsoft Azure Machine Learning

Christoph Körner, Kaijisse Waaijer

Today's growth in data volume requires distributed systems, powerful algorithms, and scalable cloud infrastructure to compute insights and to train and deploy machine learning (ML) models. This book will help you improve your knowledge of building ML models using Azure and end-to-end ML pipelines on the cloud. The book starts with an overview of an end-to-end ML project and a guide on how to choose the right Azure service for different ML tasks. It then focuses on Azure Machine Learning and takes you through the process of data experimentation, data preparation, and feature engineering using Azure Machine Learning and Python. You'll learn advanced feature extraction techniques using natural language processing (NLP), classical ML techniques, and the secrets of both a great recommendation engine and a performant computer vision model using deep learning methods. You'll also explore how to train, optimize, and tune models using Azure Automated Machine Learning and HyperDrive, and perform distributed training on Azure. Then, you'll learn different deployment and monitoring techniques using Azure Kubernetes Service with Azure Machine Learning, along with the basics of MLOps (DevOps for ML) to automate your ML process as a CI/CD pipeline. By the end of this book, you'll have mastered Azure Machine Learning and be able to confidently design, build, and operate scalable ML pipelines in Azure.

614
E-BOOK

Mastering Blockchain. Deeper insights into decentralization, cryptography, Bitcoin, and popular Blockchain frameworks

Imran Bashir

Blockchain is a distributed database that enables permanent, transparent, and secure storage of data. Blockchain technology is the backbone of cryptocurrency (in fact, it’s the shared public ledger upon which the entire Bitcoin network relies), and it’s gaining popularity with people who work in finance, government, and the arts. Blockchain technology uses cryptography to keep data secure. This book gives a detailed description of this leading technology and its implementation in the real world. This book begins with the technical foundations of blockchain, teaching you the fundamentals of cryptography and how it keeps data secure. You will learn about the mechanisms behind cryptocurrencies and how to develop applications using Ethereum, a decentralized virtual machine. You will explore different blockchain solutions and get an exclusive preview into Hyperledger, an upcoming blockchain solution from IBM and the Linux Foundation. You will also be shown how to implement blockchain beyond currencies, scalability with blockchain, and the future scope of this fascinating and powerful technology.

615
E-BOOK

Mastering Business Intelligence with MicroStrategy. Master Business Intelligence with Microstrategy 10

Dmitry Anoshin, Himani Rana, Ning Ma, Neil...

Business intelligence is becoming more important by the day, with cloud offerings and mobile devices gaining wider acceptance and achieving better market penetration. MicroStrategy Reporting Suite is an absolute leader in the BI market and offers rich capabilities from basic data visualizations to predictive analytics. It supports various delivery methods such as the web, desktops, and mobile devices. Using real-world BI scenarios, this book helps you to implement Business Analytics solutions in big e-commerce companies. It kicks off with MicroStrategy 10 features and then covers schema design models and techniques. Building upon your existing knowledge, the book will teach you advanced techniques for building documents and dashboards. It further teaches various graphical techniques for presenting data for analysis using maps, graphs, and advanced charts. Although MicroStrategy has rich functionality, the book will show how to customize it in order to meet your business requirements. You will also become familiar with the native analytical functions that will help you to maximize the impact of BI solutions with powerful predictive analytics. Furthermore, the book will focus on MicroStrategy Mobile Analytics along with data discovery and desktop capabilities such as connecting various data sources and building interactive dashboards. The book will also uncover best practices and troubleshooting techniques for MicroStrategy system administration, as well as security and authentication techniques. Lastly, you will learn to use Hadoop for MicroStrategy reporting. By the end of the book, you will be proficient in evaluating any BI software in order to choose the one that best meets your business requirements.

619
E-BOOK

Mastering Elastic Stack. Dive into data analysis with a pursuit of mastering ELK Stack on real-world scenarios

Ravi Kumar Gupta, Yuvraj Gupta

Even structured data is useless if it can’t help you to make strategic decisions and improve existing systems. If you love to play with data, or your job requires you to process custom log formats, design a scalable analysis system, explore a variety of data, and manage logs for real-time data analysis, this book is your one-stop solution. By combining the massively popular Elasticsearch, Logstash, Beats, and Kibana, elastic.co has advanced the end-to-end stack that delivers actionable insights in real time from almost any type of structured or unstructured data source. You will learn how to create real-time dashboards and how to manage the life cycle of logs in detail through real-life scenarios. This book brushes up your basic knowledge of implementing the Elastic Stack and then dives deeper into complex and advanced implementations. We’ll help you to solve data analytics challenges using the Elastic Stack and provide practical steps on centralized logging and real-time analytics in production. You will get to grips with advanced techniques for log analysis and visualization. Newly announced features such as Beats and X-Pack are also covered in detail with examples. Toward the end, you will see how to use the Elastic Stack for real-world case studies, and we’ll show you some best practices and troubleshooting techniques.

620
E-BOOK

Mastering Elasticsearch 5.x. Master the intricacies of Elasticsearch 5 and use it to create flexible and scalable search solutions - Third Edition

Bharvi Dixit

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, and open source search and analytics engine. Elasticsearch leverages the capabilities of Apache Lucene, and provides a new level of control over how you can index and search even huge sets of data. This book will give you a brief recap of the basics and also introduce you to the new features of Elasticsearch 5. We will guide you through the intermediate and advanced functionalities of Elasticsearch, such as querying, indexing, searching, and modifying data. We’ll also explore advanced concepts, including aggregation, index control, sharding, replication, and clustering. We’ll show you the modules of monitoring and administration available in Elasticsearch, and will also cover backup and recovery. You will get an understanding of how you can scale your Elasticsearch cluster to contextualize it and improve its performance. We’ll also show you how you can create your own analysis plugin in Elasticsearch. By the end of the book, you will have all the knowledge necessary to master Elasticsearch and put it to efficient use.

621
E-BOOK

Mastering Geospatial Analysis with Python. Explore GIS processing and learn to work with GeoDjango, CARTOframes and MapboxGL-Jupyter

Silas Toms, Paul Crickard, Eric van Rees

Python comes with a host of open source libraries and tools that help you work on professional geoprocessing tasks without investing in expensive tools. This book will introduce Python developers, both new and experienced, to a variety of new code libraries that have been developed to perform geospatial analysis, statistical analysis, and data management. It uses examples and code snippets to explain how Python 3 differs from Python 2, and how these new code libraries can be used to solve age-old problems in geospatial analysis. You will begin by understanding what geoprocessing is and explore the tools and libraries that Python 3 offers. You will then learn to use Python code libraries to read and write geospatial data. Next, you will perform geospatial queries within databases and learn PyQGIS to automate analysis within the QGIS mapping suite. Moving forward, you will explore the newly released ArcGIS API for Python and ArcGIS Online to perform geospatial analysis and create ArcGIS Online web maps. Further, you will take a deep dive into Python geospatial web frameworks and learn to create a geospatial REST API.

622
E-BOOK

Mastering Hadoop 3. Big data processing at scale to unlock unique business insights

Chanchal Singh, Manish Kumar

Apache Hadoop is one of the most popular big data solutions for distributed storage and for processing large chunks of data. With Hadoop 3, Apache promises to provide a high-performance, more fault-tolerant, and highly efficient big data processing platform, with a focus on improved scalability and increased efficiency. With this guide, you’ll understand advanced concepts of the Hadoop ecosystem tools. You’ll learn how Hadoop works internally, study advanced concepts of different ecosystem tools, discover solutions to real-world use cases, and understand how to secure your cluster. It will then walk you through HDFS, YARN, MapReduce, and Hadoop 3 concepts. You’ll be able to address common challenges like using Kafka efficiently, designing low-latency Kafka systems with reliable message delivery, and handling high data volumes. As you advance, you’ll discover how to address major challenges when building an enterprise-grade messaging system, and how to use different stream processing systems along with Kafka to fulfil your enterprise goals. By the end of this book, you’ll have a complete understanding of how components in the Hadoop ecosystem are effectively integrated to implement a fast and reliable data pipeline, and you’ll be equipped to tackle a range of real-world problems in data pipelines.

624
E-BOOK

Mastering Java for Data Science. Analytics and more for production-ready applications

Alexey Grigorev

Java is the most popular programming language, according to the TIOBE index, and it is a typical choice for running production systems in many companies, both in the startup world and among large enterprises. Not surprisingly, it is also a common choice for creating data science applications: it is fast and has a great set of data processing tools, both built-in and external. What is more, choosing Java for data science allows you to easily integrate solutions with existing software, and bring data science into production with less effort. This book will teach you how to create data science applications with Java. First, we will revise the most important things when starting a data science application, and then brush up on the basics of Java and machine learning before diving into more advanced topics. We start by going over the existing libraries for data processing and libraries with machine learning algorithms. After that, we cover topics such as classification and regression, dimensionality reduction and clustering, information retrieval and natural language processing, and deep learning and big data. Finally, we finish the book by talking about the ways to deploy the model and evaluate it in production settings.