Adding Real-time Functionality for Easy Data Analysis

Our Client

Odyssey Analytics' client is one of the largest independent power producers in the United States. Their retail electricity business focuses exclusively on delivering electricity products and services to commercial and industrial customers. They currently serve business customers in Delaware, Maryland, New Jersey, Ohio, Pennsylvania, and Washington, D.C.

Challenge

The client managed large volumes of real-time data, files kept on external drives, and third-party data with manual data management techniques, which were increasing their expenses. They also needed to ingest real-time energy-sector data into a data lake for easy access and quick analysis.

Solution

Odyssey Analytics proposed an AWS cloud solution to manage the real-time data, file drops, data fetched through APIs, and third-party data gathered from multiple sources, and to ingest all of it into a data lake.
Multiple AWS Lambda functions moved this data from the S3 buckets to the processing layer, where ETL operations cleaned and processed it for both real-time and batch workloads. The data warehouse and data marts were also created in the processing layer. Time-series data was stored in AWS DynamoDB, while the rest of the data was stored in AWS Redshift. Real-time data was reported directly from the processing layer through Tableau, with AWS Athena used for ad-hoc analysis. AWS CloudWatch was configured to monitor operations and provide reliable log records. Odyssey Analytics designed, configured, and implemented the complete architecture.
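
As an illustration only, the step in which a Lambda function picks up a file from S3 and cleans it might look roughly like the Python sketch below. The CSV input format, the field names (meter_id, timestamp, reading_mw), and the PROCESSED_BUCKET name are assumptions made for this example; they are not taken from the client's actual implementation.

```python
import csv
import io
import json
from urllib.parse import unquote_plus

# Hypothetical destination bucket; writing to a separate bucket avoids
# re-triggering the same function on its own output.
PROCESSED_BUCKET = "example-processed-bucket"


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def clean_rows(csv_text):
    """Drop rows with a missing ID, timestamp, or non-numeric reading."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            value = float(row["reading_mw"])
        except (KeyError, TypeError, ValueError):
            continue  # reading absent or not a number
        meter = (row.get("meter_id") or "").strip()
        ts = (row.get("timestamp") or "").strip()
        if not meter or not ts:
            continue
        cleaned.append({"meter_id": meter, "timestamp": ts, "reading_mw": value})
    return cleaned


def handler(event, context, s3_client=None):
    """Lambda entry point: read each dropped file, clean it, write JSON Lines."""
    if s3_client is None:
        import boto3  # imported lazily so the pure functions run without AWS
        s3_client = boto3.client("s3")
    processed = 0
    for bucket, key in extract_s3_objects(event):
        obj = s3_client.get_object(Bucket=bucket, Key=key)
        rows = clean_rows(obj["Body"].read().decode("utf-8"))
        out = "\n".join(json.dumps(r) for r in rows)
        s3_client.put_object(
            Bucket=PROCESSED_BUCKET, Key=key + ".jsonl", Body=out.encode("utf-8")
        )
        processed += len(rows)
    return {"rows_processed": processed}
```

Keeping the parsing and cleaning logic in plain functions, separate from the S3 calls, lets the cleaning rules be tested locally without AWS credentials.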

Result

After replacing the manual data management methods, our client saw several key improvements in their operations:

  • Automatic scaling up and down to match capacity requirements.
  • Reliable log records and operation monitoring via AWS CloudWatch.
  • Effective use of Lambda triggers and API Gateway.
  • Automated data exporting, cleaning, and loading as required.

Technology Used

  • AWS Kinesis Data Streams
  • AWS SQS/SNS
  • AWS API Gateway
  • AWS Lambda
  • S3 Bucket
  • Amazon Redshift
  • AWS DynamoDB
  • AWS CloudWatch