In today’s blog, I will be introducing you to a new open-source distributed SQL query engine, Presto. It is designed for running SQL queries over Big Data (petabytes of data). It was designed by the people at Facebook. Quoting its formal definition:
“Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.”
The folks at Facebook are at it again. They built a SQL engine especially for analytical work; this is not an online transaction processing (OLTP) engine. It’s an engine for ad-hoc queries across SQL/NoSQL databases distributed all over the place.
They use connectors for MySQL, Hadoop/Hive, MongoDB, Postgres and more. Missing are some of the standards like Microsoft SQL Server and Teradata. However, this won’t be the story for long.
Presto is in its open source newness but you should take a look at the documentation to really appreciate the power of this new thing.
via An Introduction to Presto — DZone Big Data Zone
“Backlog” is one of those SCRUMmy terms used to identify features or functions that have been dreamed up or discussed for an application. You collect these ideas into a list which is called the “Backlog”. Then this list is reviewed (Sprint Review), the ideas are refined (Groomed) and an estimate of effort (Story Points) is assigned to each one. Then folks get together and discuss which ones should be done in the next timeframe (Sprint). To collect these ideas some companies use an issue tracking system or an off-the-shelf ticket system (Atlassian JIRA) and others just use a spreadsheet… gasp.
Sometimes all you need is a simple web application that all the participants can use to enter ANY of the ideas that came up. Even things like “The buttons should be colored blue.” I needed a simple project to help me learn some technologies that are new to me. Hence the “Backlogger” was born. The whiteboard above shows the original concept.
Technologies used in Backlogger
- Mongo without the headache, neDB
The design requirements were meant to be as simple as possible so that the project could be done quickly. They also needed to be flexible to allow for better learning.
- Single Page Application
- Open Source
- No user logins, just a password, we are a big happy family
- Self-contained application, no need for outside services or servers
- Mongo database and Mongo queries
- Allow for a maintainable list of people names who contributed ideas to the backlog
- Allow for a maintainable list of functional areas to help group the ideas
- One time entry of an idea, no editing
- The editing of an idea will be done during the grooming
- Filters that help find ideas quickly
- Ability to backup and wipe the database (Mongo Documents)
- Simple report that can be printed directly
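To make a few of these requirements concrete, here is a minimal sketch of one-time idea entry plus filtering. It is written in Python purely for illustration; Backlogger itself stores Mongo-style documents in neDB, and this is not its actual code. The `Idea` fields, the `filter_ideas` helper and the sample contributors are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)  # frozen mirrors "one time entry, no editing"
class Idea:
    text: str          # the idea as it was entered
    contributor: str   # a name from the maintained list of people
    area: str          # a functional area used to group ideas
    entered: date = field(default_factory=date.today)


def filter_ideas(ideas, contributor=None, area=None, keyword=None):
    """Return the ideas that match every filter that was supplied."""
    result = []
    for idea in ideas:
        if contributor is not None and idea.contributor != contributor:
            continue
        if area is not None and idea.area != area:
            continue
        if keyword is not None and keyword.lower() not in idea.text.lower():
            continue
        result.append(idea)
    return result


# Hypothetical sample data
backlog = [
    Idea("The buttons should be colored blue.", "Pat", "UI"),
    Idea("Add a printable report.", "Sam", "Reports"),
    Idea("Filter ideas by contributor.", "Pat", "Search"),
]

ui_ideas = filter_ideas(backlog, area="UI")
pat_ideas = filter_ideas(backlog, contributor="Pat")
```

In a document store the same filters would normally be expressed as a query object rather than a Python loop; the loop is only here to keep the sketch self-contained.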
GeekMustHave would like to thank Phoenix Learning Labs for the resources and funding to do this project. GeekMustHave would also like to thank the MDHHS-DWIP team for the testing and feedback.
Open source, common components
What does it look like?
Backlogger is an Open Source project available on Github.
I’ve used “Backlogger” in one project so far but others who have seen it have expressed some interest in it. That’s another reason why it’s Open Source.
Depending on the feedback I might do additional updates. Maybe I need a “Backlogger” for the “Backlogger”?
Just click the icon to the left, follow the instructions to sign up, and you’ll be added to Manning’s Deal of the Day mailing list. This means that during the month of December you’ll receive a special discount for a particular Manning publication in your email each day, good for that day only. So, remember to check your inbox regularly and act fast if you wish to get the deal!
Microsoft is trying to give the other “on-Demand” Business Intelligence vendors a run for their money. The enterprise folks didn’t give Power-BI a second glance when they learned the only way to publish was to the cloud. Cloud = Bad. There was talk of SSRS being able to be the portal for all things Power-BI and now some of this has come true.
Microsoft has an Azure appliance that you can “play with” while Microsoft tunes things up. I think this is going to be a look-see for me in the near future. I will post what I find.
I’ve been told by friends to stop being the early adopter for technology thingies. I’m on step 4 of the 12 step Technology Anonymous (TA) program but I still can’t help trying out the latest database thing, in this case, SQL Server Management Studio 2016. Yea… should have waited, most of my existing PowerShell scripts that back up and sync stuff across my development and staged areas just kinda stopped. I’d like to thank Drew for his post about this! Helped me get my SQLFoo back in the groove again. I promise I will not miss another TA meeting.
I suggest you read a book before you dive headlong into the next shiny bright thing in databases. This book appears to be a good read, I’ve downloaded it to my Kindle and started it. So far it is very interesting and easy to read. If you don’t have any background in databases I would suggest reading an entry level book first. This review by I Programmer is a much better review than I would ever write, give it a quick 2-minute read. We are truly in the “next Generation” of database evolution, you need to pay close attention to what’s going on right now, or be stuck in the dBase/Paradox past again.
Tableau is one of the better Business Intelligence tools among the NEW BI tools being released, a group I call BI-Fresh. It works with many different data sources including some of the larger cloud based databases. While my experience is limited to using Tableau with a Teradata data warehouse, I would recommend it for your short list for BI-Fresh evaluations. Here is a recent review done by ComputerWorld on the latest release 9.0 Beta.
Teradata, SQL Server, SQLite, MySQL, Oracle, Excel, Access and dBase are databases you may use on a regular basis. If you do you probably have 3-6 Database tools to help manage them all. Each of these has their own user interface and capabilities. Database .NET from Fish Code Library is the one tool to help you manage all these databases and many more. Database .NET (DBN) is a single EXE file and can be installed as a portable product on a thumb drive or a shared product on Dropbox.
Database .NET is not the best name Fish Code Library could have picked, as it makes it difficult to locate on the web without having to look at 100 posts about Microsoft’s .NET foundation.
DBN has so many features that DBA’s and DB Developers would like.
- Can be run on Corporate computers that don’t have install rights, you folks know who I mean
- Can be run from a USB Thumb drive
- Can be run from Dropbox and one instance shared across multiple computers
Free & Premium
- Free version does 80% of what many users need
- Premium version is $30…. Really!!
That’s less than the cost of a monthly cup of coffee (Cheap coffee at that)
- Very small footprint
- No Windows installation required
- No extra DLL’s that might cause conflicts on your system
- Updates are simple, just replace the EXE file
- Overhead is very low, open multiple copies of DBN to work between different servers and database types
- Fast processing
.NET Data Providers
- DBN has all the latest .NET drivers embedded in the tool
- No ODBC Data Source Names to maintain outside the tool
- Configuration for DB connections maintained by DBN
- Teradata supported with Full capabilities of .NET 15.x.x
- 21 database servers supported
- Legacy support for dBase and Paradox, anyone remember those?
- Comma Separated Values (CSV) very common format
- Tab Separated Values (TSV) gotta love Unix support
- XML data exchange with bloated tags
- Excel (XLS) Because if it isn’t a database it’s a spreadsheet
- Insert SQL portable between different database formats
- HTML ok the web isn’t a fad and is probably going to be here a while
- Java Script Object Notation (JSON) NoSQL document format
DOS Command mode
- DBN can be called in a Windows Batch file
- Inexpensive and casual ETL tool and data transfer tool
(GeekMustHave will have an example of Teradata to SQLite Windows Batch file soon.)
- Database to Excel refresh tool with Windows shortcuts
- Basic SQL to JSON convertor for relational databases
- Version tracking on selected SQL scripts
- Simple code library for each database/project in Windows directories and files
- SQL code can be quickly saved in Windows files in a structured directory
- Copy all your SQL scripts for all your projects by copying and pasting a single directory
- Install DBN on Dropbox or other Cloud services and have your DB tool and scripts everywhere you go.
- Most Excellent
- Fish Code Library also has a number of other productivity tools
- Forum available to post improvements and vote on them
- Many of the improvement requests show up in the frequent updates
- Have I mentioned Teradata support
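The "SQL to JSON convertor" idea in the list above is easy to picture with a small sketch. This is Python with the standard `sqlite3` and `json` modules, not DBN's own implementation or command syntax; the `query_to_json` helper and the `tool` table are assumptions invented for the example.

```python
import json
import sqlite3


def query_to_json(conn, sql, params=()):
    """Run a query and return the rows as a JSON array of objects."""
    cursor = conn.execute(sql, params)
    # cursor.description holds one tuple per result column; item 0 is the name
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return json.dumps(rows, indent=2)


# Hypothetical in-memory table so the sketch runs on its own
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tool (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO tool VALUES (?, ?)",
    [("DBN Free", 0.0), ("DBN Premium", 30.0)],
)
conn.commit()

tools_json = query_to_json(conn, "SELECT name, price FROM tool ORDER BY price")
print(tools_json)
```

Each relational row becomes one JSON object keyed by column name, which is roughly the shape a NoSQL document store expects.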
To be a fair reviewer of database tools you do have to mention some areas where the tool could be improved
- Allows for the number of rows returned in the result to be automatically limited without changing the SQL. Good for large database queries and data warehouses usage.
- DBN is a Windows tool, get over it or get a copy of VMware Fusion and run it there
- That being said, a native Mac version would be nice
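Capping the number of rows returned without touching the SQL text, as mentioned in the list above, can be approximated in your own scripts too. The sketch below uses Python's standard `sqlite3` module and is only an illustration of the idea, not how DBN does it; `fetch_limited` and the `numbers` table are hypothetical names.

```python
import sqlite3


def fetch_limited(conn, sql, max_rows=100):
    """Run sql unchanged, but pull at most max_rows rows from the cursor.

    Returns (rows, truncated) where truncated is True if more rows existed.
    """
    cursor = conn.execute(sql)
    rows = cursor.fetchmany(max_rows)
    truncated = cursor.fetchone() is not None  # peek for one extra row
    return rows, truncated


# Hypothetical table with ten rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(1, 11)])

rows, truncated = fetch_limited(conn, "SELECT n FROM numbers ORDER BY n", max_rows=3)
```

Because the limit is applied while fetching, the same query text works against databases that disagree on `LIMIT`, `TOP` or `SAMPLE` syntax.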
People who serve as Database Administrators often think of themselves as gods. People like me who have to work with Teradata, Oracle, SQL Server, DB2, MySQL, Access and even dBase and Paradox need a single tool with the same user interface: WinSQL. The Lite edition has always been free. There is a developer edition for $100 and the professional edition for $250. Upgrade prices are a bit less.
Usually to move data from one system to another you have to know the tools to unload the data on one side and the tools to reload the data on the other side. If you only use one database like SQL Server then it’s not a big problem. However, moving data from, say, Oracle to Teradata is a bit tougher. DataBags to the rescue. Bag up the data on the Oracle side then unbag it on the Teradata side, all done with just one tool, WinSQL. You can even schedule the DataBags to run, managed by WinSQL.
WinSQL also handles test data generation, reverse engineering the database layouts and doing backups and restores.
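For readers who have never done the unload/reload dance by hand, here is a rough sketch of the general pattern that a tool like this saves you from. It is plain Python with two in-memory SQLite databases standing in for the source and target systems; it says nothing about how DataBags works internally, and the `unload` and `reload` helpers and the `customer` table are assumptions made up for the example.

```python
import csv
import io
import sqlite3


def unload(conn, table):
    """Dump a table to CSV text, header row first."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


def reload(conn, table, csv_text):
    """Load CSV text (header row first) into an existing table.

    Returns the number of rows inserted. Values arrive as text, so the
    target column types decide how they are stored.
    """
    reader = csv.reader(io.StringIO(csv_text))
    columns = next(reader)
    column_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = list(reader)
    conn.executemany(
        f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)


# Two in-memory databases stand in for the source and target systems
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
source.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customer (id INTEGER, name TEXT)")

bag = unload(source, "customer")
moved = reload(target, "customer", bag)
```

A sketch like this glosses over the hard parts (NULLs, dates, character sets, very large tables), which is exactly where a purpose-built tool earns its keep.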