I’m Ben, and welcome to Sanity Check, the newsletter for tips, stories, doodles, and even some questions about working in the data field. Glad you’re here.
QQ: QUICK QUOTES
Here’s what’s new with me:
🦞 I traveled to Maine this week for my cousin’s wedding. Cheers to cool weather, room-temp lobster rolls, and warm hearts!
✏️ Due to the short work week & travel, the weekly Sanity Check is getting to you late. I hope it still finds you well!
👨🏻🏫 Fun Fact: My dad’s PhD research was in signal error detection and correction. I used that as inspiration for the analogy I carried through this week’s article.
FRESH FEATURE
Analytics Antenna: Planning out a data architecture
In last week’s article, we got a feel for the people side of Squarely. Although it’s a two-man show, there are still several distinct domains where work is done. That work emits digital signals.
This week, we will shift our focus to the systems side of Squarely. We will plan out how to set up our analytics antenna to pick up those signals.
The plan covers 4 parts:
Sources - Taking inventory of the systems in use
Ingestion - Establishing a method for pulling the data from those systems
Processing - How we will shape the data to support an array of questions
Analysis - Turning the processed data into action for the business
On to our sources!
Systems Surface Area (Sources)
As a new business, or more accurately, as a side project, we must keep Squarely’s operational systems as simple as possible.
There is only one critical system — Amazon.
Choosing Amazon Kindle Direct Publishing (KDP) outsourced loads of complexity to Amazon. My dad uploads a PDF and sets a price. Payment processing, shipping, returns, and even printing the book are all handled by Amazon.
With Amazon handling nearly all sales and operations, we had the bandwidth to set up more systems. Distribution may be free on the Internet, but discovery is not guaranteed. We need good marketing systems.
To support our marketing, discovery, and acquisition needs, we landed on:
Vercel for website hosting
Fathom Analytics for website analytics
ConvertKit for email marketing
These are all the systems we have in production use today. There will be more systems to come - like a Postgres database for a web-based version of Squarely - but this is already enough to build our analytics stack on.
Signal Acquisition (Ingestion)
With our four source systems identified - Amazon, Vercel, ConvertKit, and Fathom Analytics - how are we going to get their data into a central location?
This is where ETL tools can help us acquire the signal given off by our systems. No single tool supports all four source systems, though, so let’s walk through them one by one.
Amazon
Amazon has the Selling Partner API (SP-API). This API provides all the order details you could need. I see two viable options for pulling from it:
Fivetran - Get as far as possible within the usage limits of the free plan
Meltano - Test out the community-contributed SP-API tap
I am leaning towards Fivetran because their connectors are typically high quality and it will be faster to get started.
Vercel
Vercel’s data will be helpful for cost controls. We don’t want a surprise bandwidth bill! At our current scale, this is not a burning concern.
Although this should sit on the back burner, I have been looking for a reason to kick the tires on Steampipe. Now seems as good a time as any, and Pedram’s mdsfest-opensource-mds repo should provide some good hints on how to go about working Steampipe into a pipeline.
ConvertKit
ConvertKit, our email marketing tool, cannot be put on the back burner. It is important to understand how our acquisition attempts are performing.
ConvertKit has an API, but there are not many out-of-the-box integrations with extraction tools. I only see one viable option for ingesting ConvertKit data - Meltano.
Fathom Analytics
Fathom is a tricky one. The company's first API is still in public preview, so no ETL tool has a canned integration for it yet.
Since I will need to roll my own solution, I see three possibilities:
Abuse dbt Python models as an API client
Author a Singer tap that can be used by Meltano
Someone chimes in with an actual good idea once I publish this piece
Abusing dbt would be my shortest path to the data, but the Singer tap would be a more meaningful contribution to the data community. With the help of Copilot, I know I could get the Singer tap working, but the thought of strangers relying on it makes me a little uneasy.
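To make the Singer option a bit more concrete, here is a minimal sketch of the message stream a tap produces. Per the Singer spec, a tap writes SCHEMA, RECORD, and STATE messages as JSON lines to stdout. Everything Fathom-specific below is an assumption on my part: the `pageviews` stream name, its fields, and the `fetch_pageviews` stub, which returns made-up rows in place of a real API call.

```python
import json
from typing import Iterator

STREAM = "pageviews"  # hypothetical stream name

# Hypothetical schema; the real fields depend on what Fathom's API returns.
SCHEMA = {
    "type": "object",
    "properties": {
        "date": {"type": "string", "format": "date"},
        "pathname": {"type": "string"},
        "pageviews": {"type": "integer"},
    },
}


def fetch_pageviews(since: str) -> Iterator[dict]:
    """Stand-in for the real API call; yields made-up rows dated on or after `since`."""
    rows = [
        {"date": "2023-10-01", "pathname": "/", "pageviews": 42},
        {"date": "2023-10-02", "pathname": "/", "pageviews": 57},
    ]
    return (row for row in rows if row["date"] >= since)


def tap_messages(since: str) -> Iterator[dict]:
    """Yield Singer messages: one SCHEMA, then the RECORDs, then a STATE bookmark."""
    yield {
        "type": "SCHEMA",
        "stream": STREAM,
        "schema": SCHEMA,
        "key_properties": ["date", "pathname"],
    }
    bookmark = since
    for row in fetch_pageviews(since):
        yield {"type": "RECORD", "stream": STREAM, "record": row}
        # ISO dates sort correctly as strings, so max() tracks the latest one.
        bookmark = max(bookmark, row["date"])
    # The STATE value is free-form; a "bookmarks" dict is a common convention.
    yield {"type": "STATE", "value": {"bookmarks": {STREAM: {"date": bookmark}}}}


if __name__ == "__main__":
    # A Singer target (run by Meltano, for example) reads these lines from stdout.
    for message in tap_messages(since="2023-10-01"):
        print(json.dumps(message))
```

The STATE message is what lets the next run pick up where this one left off instead of re-pulling everything.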
It is a shame this data will be hard to pull out of Fathom. The path for retrieving Google Analytics data is well trod. Yet, that’s the premium we pay for privacy.
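To keep the ingestion plan in one place, here is a small sketch that writes it down as data. The `Source` class and the helper functions are hypothetical names of mine, and the tool lists only reflect the options discussed in this article, not everything each tool can do.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Source:
    name: str
    purpose: str
    options: Tuple[str, ...]    # candidate ingestion tools from this article
    custom_build: bool = False  # True when there is no canned integration


PLAN = (
    Source("Amazon", "sales and operations", ("Fivetran", "Meltano")),
    Source("Vercel", "website hosting", ("Steampipe",)),
    Source("ConvertKit", "email marketing", ("Meltano",)),
    # Fathom needs custom code either way: a dbt Python model,
    # or a new Singer tap run through Meltano.
    Source("Fathom Analytics", "website analytics",
           ("dbt Python model", "Meltano"), custom_build=True),
)


def tools_covering_all(plan: Tuple[Source, ...]) -> Set[str]:
    """Return the tools that appear as an option for every source."""
    if not plan:
        return set()
    common = set(plan[0].options)
    for source in plan[1:]:
        common &= set(source.options)
    return common


def sources_per_tool(plan: Tuple[Source, ...]) -> Dict[str, List[str]]:
    """Group source names under each candidate tool."""
    coverage: Dict[str, List[str]] = {}
    for source in plan:
        for tool in source.options:
            coverage.setdefault(tool, []).append(source.name)
    return coverage


if __name__ == "__main__":
    print(tools_covering_all(PLAN))  # empty: no one tool covers all four
    print(sources_per_tool(PLAN))
```

Among these options, Meltano comes closest to covering everything, which is worth keeping in mind when deciding between it and Fivetran for Amazon.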
Signal Processing (Processing)
Now that we have a plan on how to receive the data from our systems, how will we turn that data into something useful?
If you’ve been following my work, you know the answer for transformation is dbt. But dbt always sits on top of another data platform - Snowflake, Databricks, BigQuery, etc.
This data platform selection is where the Squarely architecture will differ from most. I plan to initially dump the ingested data into AWS S3 buckets and then process it with the help of DuckDB and MotherDuck.
The one area where this selection comes up short is access controls. Again, with Squarely's small size, the lack of granular access controls is an acceptable trade-off.
What we gain from this selection is an inexpensive way to get started and fast development cycles. It is quick and easy to get started due to all the hard work put into the dbt-duckdb adapter.
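As a rough sketch of how the S3 landing zone and DuckDB could fit together, the helpers below build a date-partitioned object key for each load and the matching DuckDB query to scan it. The bucket name, the `raw/<source>/<stream>/dt=<date>` layout, and the Parquet file format are all assumptions for illustration; nothing here is settled yet.

```python
from datetime import date

BUCKET = "squarely-raw"  # hypothetical bucket name


def landing_key(source: str, stream: str, day: date) -> str:
    """Build the S3 object key for one day's extract of one stream."""
    return f"raw/{source}/{stream}/dt={day.isoformat()}/data.parquet"


def scan_sql(source: str, stream: str, bucket: str = BUCKET) -> str:
    """Build a DuckDB query that reads every daily file for a stream.

    The glob matches all dt=... folders, and hive_partitioning exposes
    the folder's dt value as a column in the result.
    """
    pattern = f"s3://{bucket}/raw/{source}/{stream}/*/*.parquet"
    return f"SELECT * FROM read_parquet('{pattern}', hive_partitioning = true)"


if __name__ == "__main__":
    print(landing_key("convertkit", "subscribers", date(2023, 10, 1)))
    print(scan_sql("convertkit", "subscribers"))
```

A query like the one `scan_sql` returns could back a dbt source or a staging model, which keeps the raw files in S3 as the system of record.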
Impulse Analysis (Analysis)
Our data stack is almost complete. In signal processing, there is an analysis method that establishes a feedback loop. It’s known as Impulse Analysis. You nudge the input to the signal, the impulse, and then monitor how that nudge changes the signal response. This type of analysis helps establish what inputs can be changed to get a desired output response. We want to set up the same sort of feedback loop for the business.
These feedback loops are where analytics really earns its keep. Data initiatives fail when they focus too much on data modeling or tooling implementation but never tie that work back to helping the business.
There are four ways that analytics typically closes the loop:
A/B Testing - Experimentation is a useful tool for driving incremental improvements, especially within product or marketing initiatives
Operational Analytics - This type of work pushes modeled data back into source systems. Typically, this helps front-line employees have a more complete picture during a customer interaction.
Forecasting - Coming into Q4, it is almost annual planning season. Forecasting is an important tool for helping set budgets and growth goals for the company.
Dashboarding - Closing the loop with dashboards requires more soft skills than automation. When done well, BI can be the most influential way to catch market shifts early or inform future strategic direction.
Each of these categories has its own set of tooling and knowledge base. I cannot bite off all of these at the start. So, I am just going to focus on dashboarding. The coordination costs between my dad and me are low!
I’ll be using Evidence as the BI tool. The dashboard will take the form of a weekly business review. From this weekly review, we will strategize what impulse we want to push through the business next.
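One simple way to read an impulse off the weekly review is a before-and-after comparison. The sketch below averages a metric over the weeks before the nudge and the weeks after it, then reports the difference. The `lift` function is a hypothetical helper, the sales numbers are made up for illustration, and a plain comparison like this does not account for seasonality or anything else that changed in the same weeks.

```python
from statistics import mean
from typing import Sequence

# Made-up weekly unit sales; the impulse (say, an email campaign) lands in week 4.
WEEKLY_SALES = [10, 12, 11, 13, 18, 17, 19, 18]


def lift(series: Sequence[float], impulse_index: int, window: int) -> float:
    """Average of `window` points from the impulse onward, minus the
    average of up to `window` points just before it."""
    before = series[max(0, impulse_index - window):impulse_index]
    after = series[impulse_index:impulse_index + window]
    if not before or not after:
        raise ValueError("need data on both sides of the impulse")
    return mean(after) - mean(before)


if __name__ == "__main__":
    # Baseline is 11.5 units/week, post-impulse is 18, so the lift is 6.5.
    print(lift(WEEKLY_SALES, impulse_index=4, window=4))
```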
Big Picture Architecture
We covered a lot of ground. I find it helpful to zoom back out to see where we landed.
If this architecture looks familiar - well, it’s what Jacob Matson would call the MDS-in-a-Box. It’s a great way to get started locally with minimal expense. Yet this stack also has a clear path to scale to production.
These diagrams and planning documents help establish a checklist of implementation work. Now to go do it!
PERCOLATING PONDERINGS
🚧 Caution: opinions still settling, and I’d love to hear yours
Good Guy Amazon?
If you follow the news, you know big tech companies are out of favor. Amazon is one of those tech companies that is easy to vilify.
They crush mom & pop businesses
They are growing into a monopoly
Alexa is an invasion of privacy
Last quarter, I would have agreed with these opinions - but seeing what my dad has done with Squarely on Amazon is shifting my view.
Dad self-published Squarely through Amazon’s Kindle Direct Publishing (KDP) program. He merely needed to upload a PDF and set a price to have his book available for sale. Amazon handles all the parts that would be huge barriers for anyone trying to produce a physical product. The book's payment processing, shipping, returns, and even printing are outsourced to Amazon.
When I learned how Amazon handled that last point - the physical printing of the books - it astounded me. Just like “made-to-order” restaurants don’t cook your burger until it’s rung up, Amazon does not print the book until the order is placed. There is no stockpile of Squarely books taking up precious inventory space. No, it’s created on the spot and still arrives in two days!
Let me show you the alternative to ensure this just-in-time printing capability is fully appreciated. In my first venture, TagaPet, we purchased thousands of dollars worth of our custom dog tags wholesale. All our money was tied up in inventory. If we could sell it all, we would be back in the black, but the longer the tags sat on our shelves, the closer we got to insolvency. We could not keep the dog tags moving fast enough, and the clock ran out. Just-in-time printing entirely avoids this sunk-cost inventory issue, and I can breathe a sigh of relief.
Trying to roll out these outsourced capabilities ourselves would be a huge drag! Some capabilities would be outright impossible this early in the business. Thanks to Amazon, we can instead focus on our core competencies - making puzzles & telling people about them.
My dad’s experience is not an isolated one. Since 2020, more than 50% of Amazon’s retail business revenue has come from third-party sellers. Those third-party sellers are other small businesses setting up shop under the Amazon umbrella. Shopify partnering with Amazon for a “Buy with Prime” integration will open the door for more small businesses to offload complexity to Amazon. The barriers for new businesses with physical goods are being lowered.
There are still valid concerns with Amazon. You can find them elsewhere. Here, I wanted to take a quick moment to appreciate how Amazon helps. They helped my dad’s puzzle book become a reality. I’m sure they have done the same for thousands of others.
Thank you for reading.
Let’s keep it going. 💜
If you enjoyed this edition, would you mind giving the heart below a click?