Data Warehouses and the Flying Car Dilemma — DZone Big Data Zone

Image result for flying car images

Data Warehouses and the Flying Car Dilemma — DZone Big Data ZoneTraditional data warehouses and databases were built for workloads that manifested 20 years ago. They are sufficient for what they were built to do, but these systems are struggling to meet the demands of modern business with the volume, velocity, and user demand of data. IT departments are being challenged from both ends. On one…

You can’t just paste a set of wings on your Toyota and expect to fly to your next appointment.  You can just tack on some new technologies to legacy data warehouses and expect to provide the insights to make your company survive and prosper.  Just as well you can’t just throw out the baby with the bath water.  The data warehouse probably cost quite a bit of money and effort and is embedded in the company process.

Twenty plus years ago I helped to introduce data warehousing to some very large government and corporate customers.  When I look back some of them. they are still using the same tools and processes after 20 years.  Would you continue to drive a 20-year-old car just because you paid too much for it?

This article has some great insights as to some of the alternatives.  Augment some and replace some.  The diagrams of traditional and alternative data warehouses are keepers.

via Data Warehouses and the Flying Car Dilemma — DZone Big Data Zone

Is the traditional data warehouse dead?

 

RIP.jpg

I think the ultimate question is: Can all the benefits of a traditional relational data warehouse be implemented inside of a Hadoop data lake with interactive querying via Hive LLAP or Spark SQL, or should I use both a data lake and a relational data warehouse in my big data solution?  The short answer is you should use both.  The rest of this post will dig into the reasons why.

The love affair with the noSQL (BigData) databases seems to be over.  Many of the projects using Hadoop and the other “not” relational databases have fallen by the wayside.  Some things like structured data are still done better on the old school relational database server s and accessed with SQL or some SQL tool.  As the amount of unstructured data increases so will the use of noSQL databases.

Via: Is the traditional data warehouse dead?

The Power BI Gateway; All You Need to Know

Power BI is a data analysis tool that connects to many data sources. If the data source for Power BI is located in an on-premises location, then the connection from cloud-based Power BI service, and on-premises located data source should be created with an application called Gateway. In this post, you will learn what the Gateway is, what are types of the gateway, their differences, installing the gateway, and scheduling a data set with that gateway.

If you are using a data warehouse like Teradata, Netezza, Oracle, etc then using Power-BI with the BI-Gateway may be the best option.  The Microsoft BI-Gateway can be installed as a “Personal” gateway on your desktop or laptop.  This gives the data analytical folks the ability to develop and test BI reports and data visualizations without transferring large amounts a data about.  In a production environment, the gateway would probably be installed on a separate server to help distribute the workload.  The centralized server also appeals to the “Enterprise” and “Security” minded folks.

Here is the link to the Microsoft Power-BI Gateway.

Via: The Power BI Gateway; All You Need to Know

 

Presto, Magico open source distributed SQL Engine

2017-04-25_11-23-16In today’s blog, I will be introducing you to a new open-source distributed SQL query engine, Presto. It is designed for running SQL queries over Big Data (petabytes of data). It was designed by the people at Facebook. Quoting its formal definition:

“Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.”

Th folks at Facebook are at it again.   They build a SQL engine especially for analytical work, this is not an online transaction processing (OLTP) engine.  It’s an engine for ad-hoc queries across SQL/NoSQL databases distributed all over the place.

They use connectors for MySQL, Hadoop/Hive, MongoDB, Postgres and more.  Missing are some of standards like Microsoft SQL and Teradata.  However, this won’t be the story for long.

Presto is in its open source newness but you should take a look at the documentation to really appreciate the power of this new thing.

via An Introduction to Presto — DZone Big Data Zone

Teradata Command Line Tool for Linux

TDSQL.jpg

Been looking for a simple Command line (Terminal) tool for Teradata SQL?  Look no further, this is an option, this article shows the TDSQL tool in use and has the GITHUB Link to it.  It supports Standard Input and Standard Output; it can be used in scripting languages like Bash to automate some tasks.  TDSQL is written in Perl and it also relies on the DBD::Teradata module available here.  There is a little Linux voodoo required here but it is manageable.

 

Business Intelligence – Tableau 9.0 beta rolls out

tableau_preview_new_dragndrop_advancedanalytics-100572699-primary.idge

Tableau is one of the better Business Intelligence tools in the latest NEW BI tools being released that I call BI-Fresh.  It works with many different data sources including some of the larger cloud based databases.  While my experience is limited to Teradata data warehouse with Tableau  I would recommend it for your short list for BI-Fresh evaluations.   Here is a recent review done by ComputerWorld on the latest release 9.0 Beta.

Database .NET from Fish, a Swiss Army Knife tool kit for Database developers

image Teradata, SQL Server, SQLite, MySQL, Oracle, Excel, Access and dBase are databases you may use on a regular basis.  If you do you probably have 3-6 Database tools to help manage them all.  Each of these has their own user interface and capabilities.  Database .NET from Fish Code Library is the one tool to help you manage all these databases and many more.   Database .NET (DBN) is a single EXE file and can be installed as a portable product on a thumb drive or a shared product on Dropbox.

Database .NET is not the best name Fish Code Library could have picked as it makes it difficult to locate it on the web without having to look at a 100 posts about Microsoft’s .NET foundation.

DBN has so many features that DBA’s and DB Developers would like.

Features

Portable
  • Can be run on Corporate computers that don’t have install rights, you folks know who I mean
  • Can be run from a USB Thumb drive
  • Can be run from Dropbox and one instance shared across multiple computers
Free & Premium
  • Free version does 80% of what many users need
  • Premium version is $30…. Really!!  
    That’s less than the cost of a monthly cup of coffee (Cheap coffee at that)
Single EXE
  • Very small footprint
  • No Windows installation required
  • No extra DLL’s that might cause conflicts on your system
  • Updates are simple, just replace the EXE file
  • Overhead is very low, open multiple copies of DBN to work between different servers and database types
  • Fast processing
.NET Data Providers
  • DBN has all the latest .NET drivers embedded in the tool
  • No ODBC Data Source Names to maintain outside the tool
  • Configuration for DB connections maintained by DBN
Teradata
  • Teradata supported with Full capabilities of .NET 15.x.x
Server support
  • 21 databases servers supported
  • Legacy support for dBase and Paradox, anyone remember those?
Export Formats
  • Comma Separated Values (CSV) very common format
  • Tab Separated Values (TSV) gotta love Unix support
  • XML data exchange with bloated tags
  • Excel (XLS) Because if it isn’t a database it’s a spreadsheet
  • Insert SQL portable between different database formats
  • HTML ok the web isn’t a fad and is probably going to be here a while
  • Java Script Object Notation (JSON) NoSQL document format
DOS Command mode
  • DBN  be called in a Windows Batch file
  • Inexpensive and casual ETL tool and data transfer tool
    (GeekMustHave will have an example of Teradata to SQLite Windows Batch file soon.)
  • Database to Excel refresh tool with Windows shortcuts
  • Basic SQL to JSON convertor for relational databases
Version Tracking
  • Version tracking on selected SQL scripts
Code Libraries
  • Simple code library for each database/project in Windows directories and files
  • SQL code can be quickly saved in Windows files in a structured directory
  • Copy all your SQL scripts for all you projects by copy and pasting a single directory
  • Install DBN on Dropbox or other Cloud services and have you DB tool and script everywhere you go.
Support
  • Most Excellent
  • Fish Code Library also has a number of other productivity tools
  • Forum available to post improvements and vote on them
  • Many of the improvement requests show up in the frequent updates
  • Have I mentioned Teradata support

To be a fair reviewer of database tools you do have to mention some areas where the tool could be improved

Improvements

Results Limiter
  • Allows for the number of rows returned in the result to be automatically limited without changing the SQL.  Good for large database queries  and data warehouses usage.
MAC support
  • DBM is a Windows tool, get over it or get a copy of VM Fusion and run it there
  • That being said, a native MAC version would be nice

Teradata Studio, Windows 7, Java 64 Bit runtime

Those who install Teradata Studio on a 64 Bit Version of Windows 7 will need to have to correct version of Java installed.  There are 32 bit and 64 bit versions of Java if you have the incorrect one installed you will get the following error message.

image

If you have installed Java using a browser there is a fairly good chance that you have the 32 bit version installed, not the 64 bit version.

The 64 bit Java JRE version can be found at this link.  After you install this Java version a reboot will be needed in order for the Teradata installer to find the Java path.  As of this date the correct Java version is Version 8 Update 25.  This will change over time.

Teradata Studio – SQL Assistant Display User DB in initial SQL Source Explorer

imageI have been using SQLAssistant for over 20 years back when Teradata systems were called DBC1012’s.   Now I have been asked to put together a class to help developers using SQLAssistant to use Teradata Studio.  So I have just started using it and I’m finding it both fabulous and frustrating.

I have gone through some of the forums and found some very good tips.  There are some users who say Teradata Studio is the future of Teradata developed tools combining SQL Query, Teradata admin and data movement.  It supports large data implementations and JSON.    There are some other Teradata users who claim they will never stop using SQL Assistant even if Teradata decides to stop maintaining it.

One of the items developers may be used to is when they login to SQL Assistant they get the Explorer Panel displaying their home database.  If my user id is SchusterJPWC then I have a database area called SchusterJPWC.  Some Teradata installations give this home or user database some perm space to create work tables or views for projects.  

The default setting for Teradata Studio will not do this unless your user account is owned by the DBC database.  Most Teradata installations don’t put User databases directly under the DBC root.  There is usually some hierarchy and the user database is down 2 or 3 levels.  This means your User database or Home Database will not appear in the SQL Source Explorer window.

You can fix this by going to Windows-> Preferences-> and make the following change,

image

You would think if you checked (All databases and Users) it would include your own user database, it doesn’t if it’s not under the DBA root.  The research Teradata system we use does have all the user databases under the DBC root it (All databases and users) will display my user database.

Determine SQL Server version with a query

I’m working on converting the Adventure Works 2008 database from SQL Server into Teradata so there is a database to experiment with.  Sometimes there is an issue were the SQL Server MDF files for Adventure Works didn’t match the version of the SQL Server instance.   How do you quickly tell all the details of the version, compatibility and other goodies from your server?

Here is a SQL Server query to uncover this information.

This will return a result that looks like this

SNAGHTML1c97f636

image

The Internal Version Number is important, here is a short list of the Major SQL Versions

  • SQL Server Version Compatibility Level Internal Database Version Build
  • SQL Server 2012 RTM 110 706 11.0.2100.60
  • SQL Server 2008 R2 100 665 10.50.xxxx   <—- This matches
  • SQL Server 2008 100 661 10.00.xxxx
  • SQL Server 2005  SP2 90 612 9.00.xxxx
  • SQL Server 2005 90 611 9.00.xxxx
  • SQL Server 2000 80 539 8.00.xxx(x)