Press "Enter" to skip to content

Category: Analytics

Using Natural Language Processing (AWS Comprehend) to Analyze Text (Cardi B Lyrics)

Humans spend a lot of time reading, analyzing, and responding to text (emails, chats, etc.). A lot of this is inefficient or not done for pleasure (such as the payroll companies spend to read through feedback emails, or the amount of time I spend sifting through Outlook each day). Using Natural Language Processing (NLP), we can reduce the inefficient, not-for-pleasure reading we do so that time can be re-invested into something more productive or fulfilling. For fun, I scrappily ran the lyrics to Cardi B’s “I Like It” through AWS Comprehend to see what its response would be. I also ran a review of Mission Impossible: Fallout through the same service. The full output for Cardi B can be viewed here. The full output for Mission Impossible: Fallout can be viewed here. While these are low-value examples, a more real-world use case could be using AWS Comprehend to detect the language of emails sent to your company and adjust the routing destination in real time (should they go to your English team or your Spanish team?). Another example would be using Comprehend to collect feedback on your new product launch or ad campaign. For example, we could easily capture…
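The email-routing and feedback ideas above boil down to two Comprehend calls. Below is a minimal sketch using boto3, assuming AWS credentials are already configured; the routing destinations ("english-support", "spanish-support") and the region are hypothetical, not from the post.

```python
import boto3

# Comprehend client; the region here is an assumption.
comprehend = boto3.client("comprehend", region_name="us-east-1")

def route_email(body: str) -> str:
    """Pick a (hypothetical) routing destination based on the email's dominant language."""
    result = comprehend.detect_dominant_language(Text=body)
    # Comprehend returns candidate languages with confidence scores; take the top one.
    top = max(result["Languages"], key=lambda lang: lang["Score"])
    return "spanish-support" if top["LanguageCode"] == "es" else "english-support"

def feedback_sentiment(body: str, language_code: str = "en") -> str:
    """Return POSITIVE / NEGATIVE / NEUTRAL / MIXED for a piece of feedback text."""
    result = comprehend.detect_sentiment(Text=body, LanguageCode=language_code)
    return result["Sentiment"]

if __name__ == "__main__":
    print(route_email("Hola, tengo un problema con mi pedido."))
    print(feedback_sentiment("I like it like that!"))
```

The same detect_sentiment call is what would produce the positive/negative breakdown for something like a product launch or ad campaign.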

Using AWS Lambda to Send SNS Topics in CloudWatch

AWS Lambda enables you to run code without managing a server. You simply plop in your code and it does the rest (no maintenance, scaling concerns, etc.). The cost is only $0.20 per 1 million requests, and the first million requests each month are free. In the previous post, I set up an SNS topic. I’m extending this further so that a Node.js function will be triggered in AWS Lambda each time my SNS topic is triggered. This Lambda function will feed metrics into AWS CloudWatch, which will allow me to chart/monitor/set alarms against events or patterns with my SNS topic. A practical use case for this could be understanding event patterns or logging SNS messages (and their contents) sent to your customers.

Creating your Lambda Function

From the Lambda page of the AWS console, select “Create Function”. From here, we’ll author from scratch. Below are the inputs I’ve used for this example:

Name: SNSPingerToCloudWatch
Runtime: Node.js 8.10
Role: Choose an existing role
Existing role: lambda_basic_execution

On the page after selecting “Create Function”, we’ll click “SNS” from the “Add Triggers” section and then select our SNS topic in the “Configure Triggers” section. Then click “Add” and “Save”. Here’s a screenshot of the final state. Next, click on your function…
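The post builds the function in Node.js 8.10; the sketch below shows the same idea in Python for consistency with the other examples on this page, so treat it as an approximation rather than the author’s code. The metric namespace and name are made up.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    # An SNS-triggered invocation carries one or more records.
    for record in event.get("Records", []):
        message = record["Sns"]["Message"]
        topic_arn = record["Sns"]["TopicArn"]
        # Print so the message contents also land in CloudWatch Logs.
        print(f"SNS message from {topic_arn}: {message}")
        # Push a simple count metric so topic activity can be charted and alarmed on.
        cloudwatch.put_metric_data(
            Namespace="SNSPinger",  # hypothetical namespace
            MetricData=[{
                "MetricName": "MessagesReceived",
                "Dimensions": [{"Name": "TopicArn", "Value": topic_arn}],
                "Value": 1,
                "Unit": "Count",
            }],
        )
    return {"status": "ok"}
```

Whatever the language, the function’s role needs the cloudwatch:PutMetricData permission on top of the basic execution policy.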

Indexing my movie collection

#FirstWorldProblems – having so many DVDs that you forget what you already own and end up buying multiple copies of the same movie. While 126 movies isn’t a massive collection, it’s enough for me to sometimes forget what I have when I’m pillaging the $5 bins at Best Buy and Target. To solve this, I created a Google Sheets list of my collection so I could check what I have from my phone. After typing all the titles into the list, I realized it’d be very easy for me to use the code I wrote for my DirecTV project to scrape additional details for the movies and create a nice, simple UI…so I did:

What it does

Using The Movie DB API, I pull several pieces of information about each film and store them locally: title, image, release date, rating, budget, revenue, runtime, synopsis, genres, cast, etc. Storing it locally reduces repetitive, slow API calls and allows me to cleanly add additional attributes like whether it’s DVD, Blu-ray, Google Movies, Amazon Video, etc. Adding new titles is easy – I just type in the name and the rest of the details populate immediately. There are two views: one shown…
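For a sense of the “type a title, pull the rest” flow, here’s a rough sketch against The Movie DB API with a local SQLite table. The API key placeholder, table layout, and the formats field (DVD/Blu-ray/etc.) are assumptions for illustration, not the author’s code.

```python
import sqlite3
import requests

TMDB_KEY = "YOUR_TMDB_API_KEY"  # hypothetical placeholder
db = sqlite3.connect("movies.db")
db.execute("""CREATE TABLE IF NOT EXISTS movies (
    id INTEGER PRIMARY KEY, title TEXT, release_date TEXT,
    rating REAL, runtime INTEGER, synopsis TEXT, formats TEXT)""")

def add_title(name: str, formats: str = "DVD"):
    # Search by name, then fetch full details for the top match.
    search = requests.get(
        "https://api.themoviedb.org/3/search/movie",
        params={"api_key": TMDB_KEY, "query": name},
    ).json()
    movie_id = search["results"][0]["id"]
    details = requests.get(
        f"https://api.themoviedb.org/3/movie/{movie_id}",
        params={"api_key": TMDB_KEY},
    ).json()
    # Store locally so later lookups don't hit the API again.
    db.execute(
        "INSERT OR REPLACE INTO movies VALUES (?, ?, ?, ?, ?, ?, ?)",
        (details["id"], details["title"], details["release_date"],
         details["vote_average"], details["runtime"], details["overview"], formats),
    )
    db.commit()

add_title("Mission: Impossible - Fallout", formats="Blu-ray")
```

Keeping the response locally is what makes later lookups instant and lets attributes like formats live alongside the API data.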

Logging router traffic and other changes

It’s been a slow couple of months between the holidays, travelling, and work. I did manage to accomplish a few things with the Home Dashboard project, though. I redesigned the UI to move away from an exclusively mobile interface, as the amount and type of data I’m including in the project now simply don’t make sense to squeeze into a mobile UI. Sometime in early January, the system passed the 2 millionth record milestone. I’m unsure what I’ll do with some of the data I’m collecting at this point, but I’ve learned a lot through collecting it and I’m sure I’ll learn more analyzing it at some point in the future. This brings the list of events I’m collecting to:

Indoor temperature and humidity
Amazon Echo music events
DirecTV program and DVR information
Cell phone location and status details
Local fire and police emergency events
Home lights and other Wink hub events
…and now home network information

Analyzing home network information

The biggest change was the addition of network event logging. After seeing that a foreign IP was accessing my LAN, I started logging each request to or from my home network until I was sure I had fixed…
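The post doesn’t show how the network events are captured, so the sketch below is one common approach rather than the author’s: point the router’s syslog output at a small UDP listener and write each line to a local database. The port, schema, and lack of parsing are all assumptions.

```python
import socket
import sqlite3

# Listen for syslog lines forwarded by the router (assumes the router is
# configured to send syslog to this host on UDP 5140, an unprivileged port).
db = sqlite3.connect("network_events.db")
db.execute("""CREATE TABLE IF NOT EXISTS router_log (
    received_at TEXT DEFAULT (datetime('now')), source_ip TEXT, raw TEXT)""")

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5140))

while True:
    data, addr = sock.recvfrom(4096)
    # Store the raw line; parsing out source/destination IPs can happen later at query time.
    db.execute("INSERT INTO router_log (source_ip, raw) VALUES (?, ?)",
               (addr[0], data.decode(errors="replace")))
    db.commit()
```

With the raw lines stored, spotting an unexpected foreign IP becomes a simple query against the raw column.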

Collecting and Handling 911 Event Data

Seattle has a pretty awesome approach to data availability and transparency through data.Seattle.gov. The city has thousands of data sets available (from in-car police video records to land zoning to real-time emergency feeds), and Socrata, a Seattle-based company, has worked with the city (and many other cities) to allow developers to engage this data however they like. I spent some time playing around with some of the data sets and decided it’d be nice to know when police and fire events occurred near my apartment. I set up a script to pull the fire and police calls for events occurring within 500 meters of my apartment and started storing them into a local database (Socrata makes it so simple – amazing work by that team). While reading events from the API, I check the proximity of the event to my address and also the type of event (burglary, suspicious person, traffic stop, etc.) and trigger emails for the ones I really want to know about (such as a nearby rape, burglary, shooting, vehicle theft, etc.). I decided to store all events, even traffic stops, just because. I may find a use for it later – who knows… After I’ve scrubbed through and sent…
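Here’s a sketch of the pull-and-filter step using the Socrata (SODA) API directly via requests. The dataset ID, column names, coordinates, and alert list below are placeholders/assumptions, not the post’s actual values.

```python
import requests

DATASET_URL = "https://data.seattle.gov/resource/XXXX-XXXX.json"  # hypothetical dataset ID
HOME_LAT, HOME_LON = 47.6062, -122.3321                           # placeholder coordinates
ALERT_TYPES = {"burglary", "shooting", "vehicle theft"}

def fetch_nearby(radius_m: int = 500):
    # SoQL's within_circle() does the 500 m proximity check server-side,
    # assuming the dataset exposes a point column (called "location" here).
    params = {
        "$where": f"within_circle(location, {HOME_LAT}, {HOME_LON}, {radius_m})",
        "$order": "datetime DESC",  # column name is an assumption
        "$limit": 100,
    }
    return requests.get(DATASET_URL, params=params).json()

for event in fetch_nearby():
    event_type = event.get("type", "").lower()
    if any(t in event_type for t in ALERT_TYPES):
        print("Would send alert email for:", event_type, event.get("address"))
    # In the real setup every event gets stored; only the alert-worthy ones get an email.
```

Client libraries like sodapy wrap this same endpoint if you’d rather not build the query strings by hand.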

Data Visualization and Demo

As mentioned previously, my goal wasn’t just to create a home controller/dashboard but to also collect as much data as possible while doing so. So tonight, I started playing around with a few different visualizations of the data I’ve collected thus far. It took a few hours, but I’m satisfied with the current state. I’m doing simple dumps of the most recent music played by my Amazon Echo; the most recent programming watched via DirecTV; visualizing the daily average, minimum, and maximum temperature and humidity levels in my apartment; visualizing by hour of day the average, min, and max temperature for the current month vs. the previous month; breaking down the amount of time I spend at home by day of week (and telling on myself that I like to leave work early on Fridays :)); and visualizing my TV watching habits by hour of day and day of week. I recorded a video of all of this and also included the DirecTV control demo at the end.
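The hour-of-day temperature rollup described above is essentially one GROUP BY. A minimal example is below, run against a hypothetical SQLite readings table (the schema and column names are assumed, not the project’s actual ones).

```python
import sqlite3

db = sqlite3.connect("home_dashboard.db")
rows = db.execute("""
    SELECT strftime('%H', recorded_at) AS hour,
           AVG(temperature) AS avg_temp,
           MIN(temperature) AS min_temp,
           MAX(temperature) AS max_temp
    FROM temperature_readings
    WHERE recorded_at >= date('now', 'start of month')
    GROUP BY hour
    ORDER BY hour
""").fetchall()

for hour, avg_t, min_t, max_t in rows:
    print(f"{hour}:00  avg={avg_t:.1f}  min={min_t}  max={max_t}")
```

Swapping the WHERE clause to the previous month gives the month-over-month comparison, and the same pattern works for the day-of-week breakdowns.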

Using a Pi to measure TV habits

As noted here, I’m using the DirecTV SHEF API and a Raspberry Pi to poll my DirecTV receivers every minute and store what they’re doing into a MySQL database (a rough sketch of that polling loop follows the stats below). After ~6 months of storing that data, I thought it’d be interesting to analyze some of it. After all, do I really watch enough HBO and Starz to warrant paying for them every month?

Channel Preferences

25,900 minutes of TV watched (~2.25 hours per day…eek; still less than the national average!).
2,612 minutes of that was recorded (10%).
NBC is our favorite channel (3,869 minutes watched, 15%).
E! is our second favorite channel (1,911 minutes watched, 7.37%). Gotta keep up with those Kardashians.
Premium movie channels (HBO, Starz, Encore, Cinemax, etc.) were watched 6,870 minutes (26.52%) – apparently worth the money.
Premium movie channels were recorded only 571 minutes (lots of ad hoc movie watching, I guess).
NBC is our most recorded channel (479 minutes), followed by HGTV (391 minutes) and ABC (330).

Time Habits

Sunday is the most watched day (no surprise here) with 7,157 minutes watched (28%).
Saturday is second with 5,385 (21%).
Wednesday is the least watched with 1,144 (4.4%).
April was our biggest TV month with 6,413 minutes…
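Here’s a rough sketch of the once-a-minute poll mentioned at the top of this post. The SHEF endpoint path and response fields are treated as assumptions, and SQLite stands in for the MySQL database the post actually uses, so this is an approximation rather than the author’s script.

```python
import sqlite3
import time
import requests

RECEIVER_IP = "192.168.1.50"  # placeholder receiver address
db = sqlite3.connect("tv_habits.db")
db.execute("""CREATE TABLE IF NOT EXISTS tuned (
    polled_at TEXT, callsign TEXT, major INTEGER, title TEXT, recorded INTEGER)""")

while True:
    try:
        # SHEF exposes an HTTP interface on the receiver; getTuned reports what's on screen.
        tuned = requests.get(f"http://{RECEIVER_IP}:8080/tv/getTuned", timeout=5).json()
        db.execute(
            "INSERT INTO tuned VALUES (datetime('now'), ?, ?, ?, ?)",
            (tuned.get("callsign"), tuned.get("major"), tuned.get("title"),
             1 if tuned.get("isRecording") else 0),
        )
        db.commit()
    except (requests.RequestException, ValueError):
        pass  # receiver asleep or unreachable; skip this minute
    time.sleep(60)
```

Minutes watched per channel then falls out of counting rows per callsign, since each row represents one polled minute.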