Data and information are the fuel of this new age, in which powerful analytics algorithms burn this fuel to generate decisions that are expected to create a smarter and more efficient world for all of us to live in. This new area of technology has been defined as Big Data Science and Analytics, and the industrial and academic communities are recognizing it as a competitive technology that can generate significant new wealth and opportunity.
Big data is defined as collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and analyze the data using traditional databases and data processing tools. Big data science and analytics deals with the collection, storage, processing and analysis of massive-scale data. Industry surveys, for instance those by Gartner and e-Skills, predict that there will be over 2 million job openings for engineers and scientists trained in data science and analytics alone, and that the job market in this area is growing at a rate of 150 percent year over year.
We have written this textbook, as part of our expanding A Hands-On Approach(TM) series, to meet this need at colleges and universities, and also for big data service providers who may be interested in offering a broader perspective of this emerging field to accompany their customer and developer training programs. The typical reader is expected to have completed a couple of courses in programming using traditional high-level languages at the college level, and is either a senior or a beginning graduate student in one of the science, technology, engineering or mathematics (STEM) fields. An accompanying website for this book contains additional support for instruction and learning (www.big-data-analytics-book.com).
The book is organized into three main parts, comprising a total of twelve chapters. Part I provides an introduction to big data, applications of big data, and big data science and analytics patterns and architectures. A novel data science and analytics application system design methodology is proposed, and its realization through the use of open-source big data frameworks is described. This methodology describes big data analytics applications as realizations of the proposed Alpha, Beta, Gamma and Delta models, which comprise tools and frameworks for collecting and ingesting data from various sources into the big data analytics infrastructure, distributed filesystems and non-relational (NoSQL) databases for data storage, and processing frameworks for batch and real-time analytics. This new methodology forms the pedagogical foundation of this book.
Part II introduces the reader to various tools and frameworks for big data analytics, and the architectural and programming aspects of these frameworks, with examples in Python. We describe Publish-Subscribe messaging frameworks (Kafka & Kinesis), Source-Sink connectors (Flume), Database Connectors (Sqoop), Messaging Queues (RabbitMQ, ZeroMQ, RestMQ, Amazon SQS) and custom REST, WebSocket and MQTT-based connectors. The reader is introduced to data storage, batch and real-time analysis, and interactive querying frameworks including HDFS, Hadoop, MapReduce, YARN, Pig, Oozie, Spark, Solr, HBase, Storm, Spark Streaming, Spark SQL, Hive, Amazon Redshift and Google BigQuery. Also described are serving databases (MySQL, Amazon DynamoDB, Cassandra, MongoDB) and the Django Python web framework.
Part III introduces the reader to various machine learning algorithms with examples using the Spark MLlib and H2O frameworks, and visualizations using frameworks such as Lightning, Pygal and Seaborn.
A recent industry report from Gartner points out that choices related to cloud computing at enterprises have changed from "if" to "how": how to build, deploy, consume, manage, secure and integrate cloud services into their operations. The cloud solutions architect is the person who defines the enterprise cloud strategy from a technical point of view and must take responsibility for rolling out these cloud services.
Cloud computing is a transformative paradigm that enables scalable, convenient, on-demand access to a shared pool of configurable computing and networking resources for efficiently delivering applications and services over the Internet. Amazon Web Services (AWS), a leading provider of cloud platforms and services, defines a cloud solutions architect as one who can provide solution plans based on the best architectural practices for cloud applications, design and deploy highly scalable and fault-tolerant services, assist in lifting legacy applications and shifting them to the cloud, identify and plan for data entry into and exit from the cloud platform, and choose suitable cloud services based on data, compute, and security requirements. Further, the cloud solutions architect ensures that enterprise offerings conform to sound principles, such as the AWS Well-Architected Framework (WAF) for cloud applications and services.
We have written this textbook, as part of our expanding A Hands-On Approach(TM) series, to meet this need: to train the next generation of cloud solutions architects in educational programs at colleges and universities, and in accompanying cloud certification programs where students are interested in obtaining valuable hands-on skills on actual cloud platforms to further develop their knowledge and competency base.
The typical reader is expected to have completed a couple of courses in programming using traditional high-level languages at the college level, and is either a senior or a beginning graduate student in one of the science, technology, engineering or mathematics (STEM) fields. The reader is provided the necessary guidance and knowledge to develop working code for real-world cloud computing applications. In our opinion, developing practical applications alongside the traditional instructional material in the book further enhances the learning process. Furthermore, an accompanying website for this book (http://hands-on-books-series.com) contains additional support for instruction and learning.
Companies today are undergoing digital transformation to build agile IT infrastructures that not only provide traditional IT support functions, but also enable innovation in business operations and planning. Rather than custom solutions that lock them into legacy systems, companies want flexible and cost-effective solutions that leverage the cloud's potential. Migrating to the cloud opens exciting new opportunities. Microservices architecture offers a way to realize complex, cloud-native systems by decomposing functionality into numerous independent services that work together. This reduces overall complexity, allows quicker changes to meet shifting business needs, and enables efficient scaling for performance and reliability. Microservices are especially well-suited for cloud platforms and facilitate reorganization of development and operations (DevOps) methods to suit faster delivery schedules.
However, a gap exists between academic coverage of microservices patterns and actual deployment of microservices-based solutions on real cloud platforms. Many excellent resources focus on architectural principles but do not provide clear guidance on implementation. Conversely, books on specific cloud providers emphasize hands-on skills but fail to provide foundational knowledge to evaluate solutions properly or transfer learning across platforms. This textbook bridges the gap by enabling readers to rapidly grasp microservices concepts and then deploy practical microservices applications on real cloud platforms. With hundreds of figures and tested code samples, we offer a rigorous, hype-free guide to developing robust cloud-native apps.
The book meets the need for educational programs at colleges and universities to train the next generation of cloud solutions architects and DevOps engineers. It accompanies cloud computing curricula and certification programs where students seek valuable hands-on experience on commercial cloud platforms to complement conceptual knowledge. The typical reader is a senior undergraduate or beginning graduate student in science, technology, engineering, or mathematics (STEM) fields who has completed introductory programming courses. The book provides the necessary guidance and knowledge for readers to develop working code for cloud-based microservices applications. We believe augmenting traditional classroom learning with practical coding exercises significantly enhances the learning process. Additional student support resources are available on the book's companion website.
The textbook comprises twelve chapters delivering in-depth coverage of key concepts, technologies, and architectural patterns for cloud-based microservices. Our competency development approach aims to equip readers with practical skills rather than dwell on theory covered adequately elsewhere. The book allows readers to quickly understand what microservices are and then deploy them on real cloud platforms, while providing the technical background they need to evaluate and use cloud-based platforms competently.
In the US, the services sector employs about 100 million people, while the manufacturing sector employs about 20 million. These sectors are highly automated and driven by sophisticated business processes that form an integral part of the digital economy. While the applications themselves may be distributed over the Internet in time and space, the core business, regulatory, and financial aspects of the digital economy are still centralized, with the need for centralized agencies (such as banks, customs authorities, and tax agencies) to authenticate and settle payments and transactions. These centralized services are often manual and difficult to automate, and they represent a bottleneck to a frictionless digital economy. The next revolutionary step in the services and manufacturing economy of the future is the development of automated distributed applications that do not depend on these traditional centralized agencies for controlling, facilitating and settling multi-party transactions that may themselves be subject to complex contractual constraints.
Blockchain technology is an integral part of these next steps and promises a smart new world of automation of complex services and manufacturing processes. A blockchain is a distributed, public ledger that maintains records of all the transactions on a blockchain network comprising suppliers of products and services and consumers. Because a blockchain can establish trust in a peer-to-peer network through a distributed consensus mechanism rather than relying on a powerful centralized authority, the technology is seen by industry experts as one of the greatest innovations since the invention of the Internet. According to Santander, blockchain technologies can reduce annual costs for financial firms by $20 billion by streamlining processes and improving efficiency. In addition, investment and spending on blockchain technology is expected to increase at a compound annual growth rate (CAGR) of 52% through 2019.
We have written this textbook, as part of our expanding A Hands-On Approach(TM) series, to serve as a textbook for senior-level and graduate-level courses on financial and regulation technologies, business analytics, Internet of Things, and cryptocurrency. This book is also written for use within industries in the FinTech and RegTech space that may be interested in rolling out products and services that utilize this new area of technology. An accompanying website for this book contains additional support for instruction and learning (www.blockchain-book.com).
The book is organized into three main parts, comprising a total of ten chapters. Part I provides an introduction to blockchain concepts, design patterns, and architectures for blockchain applications. A blockchain stack comprising a decentralized computation platform, a decentralized messaging platform, and a decentralized storage platform is described. Part II introduces the reader to tools and platforms for blockchain, such as Geth, PyEthApp, TestRPC, Mist Ethereum Wallet, MetaMask, the Web3 JavaScript API and the Truffle Dapp framework. Implementation examples of various smart contracts and decentralized applications (Dapps) are provided. Part III focuses on advanced topics such as the security- and scalability-related challenges for blockchain platforms.
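As a rough illustration of the ledger idea mentioned in this preface, the following minimal Python sketch chains transaction records together with SHA-256 hashes, so that altering an earlier record invalidates the chain. It is a single-process toy under stated assumptions, not the book's implementation: it has no peer-to-peer network and no consensus mechanism, and the names `block_hash`, `append_block`, `is_valid` and the sample transactions are hypothetical.

```python
import hashlib
import json


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents deterministically (hypothetical helper)."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a block that records the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check that each block links to the one before it."""
    expected_prev = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != recomputed:
            return False
        expected_prev = block["hash"]
    return True


# Illustrative data only: two blocks of made-up transactions.
ledger = []
append_block(ledger, [{"from": "supplier", "to": "consumer", "amount": 10}])
append_block(ledger, [{"from": "consumer", "to": "supplier", "amount": 4}])
```

Calling `is_valid(ledger)` on the untouched list returns `True`; changing any field of an earlier block and calling it again returns `False`, because the stored hash no longer matches the recomputed one. A real blockchain additionally replicates the ledger across peers and uses a consensus mechanism to agree on which blocks are appended.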