Other Data Workflows#
There are lots of different ways to build on the relational database concepts we’ve covered.
Other Relational Database Workflows#
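Examples of relational database platforms include MySQL, Oracle, and MariaDB. MongoDB is often mentioned alongside these platforms, but it is a document-oriented NoSQL database rather than a relational one.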
Cloud Database Workflows#
In most enterprise systems, database access is managed through cloud or web-based systems.
Examples include Microsoft Azure and Google Cloud.
For example, Notre Dame’s data repository (DataND) uses Amazon Web Services for storage and provides user access via AWS Snowball.
NoSQL Workflows#
Non-SQL (NoSQL) databases are a data storage option that is not based on table relationships. NoSQL workflows can use a variety of data structures, such as key-value pairs, documents, and graphs.
Graph Object#
For example, a graph (or graph object) database uses nodes and relationships to store data.
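To make the idea of nodes and relationships concrete, here is a minimal sketch in plain Python. It is not how any particular graph database stores data; the node names, labels, relationship types, and the `neighbors` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical nodes: each node has an id and a dictionary of properties.
nodes = {
    "alice": {"label": "Person", "name": "Alice"},
    "bob": {"label": "Person", "name": "Bob"},
    "nd": {"label": "University", "name": "Notre Dame"},
}

# Hypothetical relationships, stored as (source, type, target) triples.
relationships = [
    ("alice", "KNOWS", "bob"),
    ("alice", "ATTENDS", "nd"),
    ("bob", "ATTENDS", "nd"),
]

# Build an index of outgoing relationships so lookups do not scan every triple.
outgoing = defaultdict(list)
for source, rel_type, target in relationships:
    outgoing[source].append((rel_type, target))


def neighbors(node_id, rel_type=None):
    """Return ids of nodes reachable from node_id, optionally filtered by type."""
    return [
        target
        for r_type, target in outgoing[node_id]
        if rel_type is None or r_type == rel_type
    ]


# Follow relationships from one node and print the connected nodes' names.
for target_id in neighbors("alice", "ATTENDS"):
    print(nodes[target_id]["name"])
```

Unlike a relational table, nothing here requires every node to share the same columns; a query works by following relationships from one node to the next.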
Additional Resources#
For more on SQL versus NoSQL workflows:
“SQL vs. NoSQL Database: When to Use, How to Choose” ML4Devs blog post (7 July 2021)
Alternate File Formats#
Especially when working with large datasets, .csv files can be prohibitively resource intensive in terms of processing, runtime, memory, etc.
Alternate data structures that incorporate column-based (rather than row-based) data storage have emerged as alternatives.
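The difference between row-based and column-based storage can be sketched with plain Python structures. This is only an illustration of the layout idea, not of how any specific file format is implemented on disk; the records and field names below are made up.

```python
# Hypothetical records in a row-based layout: one dictionary per row,
# similar to reading a .csv file line by line.
rows = [
    {"id": 1, "city": "South Bend", "temp_f": 71.5},
    {"id": 2, "city": "Chicago", "temp_f": 68.0},
    {"id": 3, "city": "Detroit", "temp_f": 66.2},
]

# The same data in a column-based layout: one list per column.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Row-based: computing one column's mean touches every field of every row.
mean_from_rows = sum(row["temp_f"] for row in rows) / len(rows)

# Column-based: the same mean only needs the single "temp_f" list.
mean_from_columns = sum(columns["temp_f"]) / len(columns["temp_f"])

print(columns["city"])
print(mean_from_rows, mean_from_columns)
```

Because each column is stored together, an analysis that needs only a few columns of a wide dataset can skip the rest, which is one reason column-based formats tend to suit large analytical workloads.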
Specific file formats include Parquet and Feather. We will not be covering these workflows in depth, but here are some resources to get started:
tomaztsql, “Comparing performances of CSV to RDS, Parquet, and Feather file formats in R” R-bloggers (8 May 2022)
Databricks, “Parquet” Databricks.com
Hadley Wickham, “Feather: A Fast On-Disk Format for Data Frames for R and Python, powered by Apache Arrow” Posit blog (29 May 2016)
Feather website/documentation