In today’s blog, I will be introducing you to a new open-source distributed SQL query engine, Presto. It is designed for running SQL queries over Big Data (petabytes of data). It was designed by the people at Facebook. Quoting its formal definition:
“Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.”
The folks at Facebook are at it again. They built a SQL engine especially for analytical work; this is not an online transaction processing (OLTP) engine. It’s an engine for ad-hoc queries across SQL/NoSQL databases distributed all over the place.
They use connectors for MySQL, Hadoop/Hive, MongoDB, Postgres and more. Missing are some of the standards like Microsoft SQL Server and Teradata. However, this won’t be the story for long.
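To make the idea of one SQL query spanning separate data stores a bit more concrete, here is a minimal sketch in Python. It does not use Presto at all. SQLite's ATTACH DATABASE from the standard library stands in for the idea, and the table and column names (users, orders) are made up for illustration. In Presto, the role of the attached database name would be played by a connector's catalog.

```python
import sqlite3

# Two in-memory databases stand in for two separate data stores.
# (Assumption: this is SQLite, not Presto; it only illustrates the idea.)
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

# Hypothetical tables: users live in one store, orders in the other.
conn.execute("CREATE TABLE main.users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales.orders (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO main.users VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO sales.orders VALUES (?, ?)",
                 [(1, 10.0), (1, 5.5), (2, 7.0)])


def total_per_user(connection):
    """Run one ad-hoc analytic query that joins across both stores."""
    query = """
        SELECT u.name, SUM(o.amount) AS total
        FROM main.users AS u
        JOIN sales.orders AS o ON o.user_id = u.id
        GROUP BY u.name
        ORDER BY u.name
    """
    return connection.execute(query).fetchall()


totals = total_per_user(conn)
print(totals)
```

The point of the sketch is only the shape of the query: one SELECT, two sources, no copying of data from one store into the other first.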
Presto is still new as an open-source project, but you should take a look at the documentation to really appreciate the power of this new thing.
via An Introduction to Presto — DZone Big Data Zone
Check out the champion blog at HTTP://GeekMustHave.COM
Back in the day times four, I was a hierarchical database specialist. Anyone remember hierarchical, flat files or indexed sequential access (ISAM)? When relational DBMS took off, back in the day times two, I became a relational convert and preached the benefits of SQL in all of its English-like glory. Then I suffered through DDL hell and mapping madness, but I stayed true to the cause. Well, now I’m a Not Only SQL (NoSQL) first-level apprentice and speak the Mongo and Hadoop chapters of the Database bible. What’s worse is that I’m a flip-flopper between the relational and document worlds. What I’ve learned is you don’t have to pick the Right side or the Left side; you will end up eating both sides eventually.
Take some time to learn the NoSQL side of the database house. Install Mongo on your system, get a few sample NoSQL databases, and use the Mongo command line just to learn some of the syntax. There are a ton of YouTube videos and Udemy courses on this stuff. Once you’ve created your first table (a collection, in Mongo terms) without the Create Table… you might like it. Mikey did, he likes everything.
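The "no Create Table" point can be tried without installing anything. Here is a rough sketch in plain Python: SQLite (standard library) refuses an insert until the table has been declared, while a document collection accepts records of any shape. The collection is modeled here as a plain list of dicts, not a real Mongo collection, and the pets name and its fields are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relational side: inserting before CREATE TABLE fails.
try:
    conn.execute("INSERT INTO pets (name) VALUES ('Rex')")
    relational_error = None
except sqlite3.OperationalError as exc:
    relational_error = str(exc)

# Document side: a list of dicts stands in for a collection.
# (Assumption: this mimics the schemaless idea, it is not Mongo's API.)
pets = []
pets.append({"name": "Rex", "species": "dog"})
# The second document has an extra field; nothing had to be declared first.
pets.append({"name": "Tweety", "species": "bird", "can_fly": True})

# Query by a field that only some documents carry.
flyers = [doc["name"] for doc in pets if doc.get("can_fly")]

print(relational_error)
print(flyers)
```

The trade-off runs both ways: the document side never complains about a missing or misspelled field either, so the checking a schema would have done moves into your own code.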
This article from Lisa Vass is a good introduction as to why and when you should choose one over the other. It’s not surprising that relational databases are not doing as much of the lifting as they have in the past.
I suggest reading a book before you dive headlong into the next shiny bright thing in databases. This book appears to be a good read; I’ve downloaded it to my Kindle and started it. So far it is very interesting and easy to read. If you don’t have any background in databases I would suggest reading an entry-level book first. This review by I Programmer is a much better review than I would ever write, so give it a quick 2-minute read. We are truly in the “next generation” of database evolution. You need to pay close attention to what’s going on right now, or be stuck in the dBase/Paradox past again.
I highly recommend reading this article on NoSQL databases. It is the best description of the different types of NoSQL that exist. There are four big NoSQL types: key-value store, document store, column-oriented database, and graph database. Each type solves a problem that can’t be solved with relational databases. Actual implementations are often combinations of these. OrientDB, for example, is a multi-model database, combining NoSQL types. OrientDB is a graph database where each node is a document.
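If the four types blur together, it can help to see the same few facts laid out in each shape. This is only a sketch using plain Python structures; none of it is a real database or any product's API, and every name in it (the users, the follows edge) is hypothetical. The last part mirrors the graph-of-documents idea mentioned for OrientDB without using OrientDB itself.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"user:1": "Ann", "user:2": "Bob"}

# Document store: each record is a self-describing, possibly nested document.
documents = [
    {"_id": 1, "name": "Ann", "langs": ["SQL", "Python"]},
    {"_id": 2, "name": "Bob"},
]

# Column-oriented: values are kept per column, so reading one column
# does not touch the others.
columns = {"id": [1, 2], "name": ["Ann", "Bob"], "age": [41, 35]}

# Graph: nodes plus labeled edges. Here each node is itself a document,
# which is the multi-model combination described above.
nodes = {doc["_id"]: doc for doc in documents}
edges = [(1, "follows", 2)]


def followed_by(node_id):
    """Return the names of the nodes that node_id follows."""
    return [nodes[dst]["name"]
            for src, label, dst in edges
            if src == node_id and label == "follows"]


# An analytic-style read that only needs the age column.
average_age = sum(columns["age"]) / len(columns["age"])

print(key_value["user:1"], followed_by(1), average_age)
```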
The book is a pre-order from Amazon and is expensive, but I think it will be a good read and reference. I placed my order already.