I've written a lot over the years. Top-level stats: over 500,000 words, almost 1,400 posts, and 2,400 comments published internally at Automattic (excluding Slack, Basecamp, and GitHub). I cannot share all of these, but I will share some examples below.
Beyond writing samples, I'd also like to highlight the long list of kudos I've received from my team members over the years. Kudos are our internal way of publicly recognizing and appreciating coworkers.
Note: most of these are screenshots. You can enlarge them by clicking on them.
This is a little outdated, but it highlights how I worked back when I did support, including workflows and the apps I use(d). Still, it's a good example of how I'd write a pretty general post that's not directly work-related:
I also document my own learning journey, but most of that lives in my personal notes files. Only a few entries make it to a personal blog I keep, usually those tied to official credentials/certifications. For example, here are all the blog posts I published while learning SQL/Python for my data work at Automattic:
Here's one example of a project thread I can safely share. It follows a communication process similar to that of 37signals, where progress updates are expected and division leads can follow the project thread and receive a notification for every update:
We use a special way of writing SQL internally, which often means changing variables into absolute values when testing locally (as opposed to publishing the end result on GitHub). This can be very frustrating, so I used an AppleScript to solve the problem and then shared it with the data division:
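For readers unfamiliar with the problem, here is a minimal Python sketch of the idea (not the AppleScript itself). The `{{ name }}` placeholder syntax, the `fill_placeholders` helper, and the sample query are all assumptions made up for illustration; the internal SQL format isn't shown here.

```python
import re


def fill_placeholders(sql: str, values: dict[str, str]) -> str:
    """Replace each {{ name }} placeholder with its literal value.

    The placeholder syntax is hypothetical; it stands in for whatever
    variable notation the real queries use.
    """

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly instead of leaving a placeholder in the query.
            raise KeyError(f"No value provided for placeholder: {name}")
        return values[name]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, sql)


# Hypothetical templated query, as it might be published.
template = (
    "SELECT COUNT(*) FROM events "
    "WHERE day >= '{{ start_date }}' AND blog_id = {{ blog_id }}"
)

# Literal values used only for local testing.
query = fill_placeholders(template, {"start_date": "2024-01-01", "blog_id": "42"})
print(query)
```

Raising an error on a missing value is a deliberate choice in this sketch: a query that silently keeps a placeholder tends to fail later with a far less helpful message.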
This is an example of writing documentation for a process that wasn't previously defined, but had to be to ensure consistency among data reports published by the various data teams:
This is an example of an in-depth SQL analysis I ran on Tumblr, only to learn that the information was stored elsewhere. Tumblr has gone through 4 acquisitions, and the data is quite messy. Thankfully, someone knew where to look, which gave me an opportunity to complete the analysis accurately (while still keeping the wrong approach on record so it doesn't happen again).
After receiving a data request, we realized that the team was struggling to understand retention rates. This is a follow-up post I shared, using the specific experiment as an example, to help explain the process to everyone else at Tumblr.
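As a rough illustration of what a retention rate measures, here is a minimal Python sketch of classic day-N retention. The definition, the `day_n_retention` function, and the sample data are assumptions for illustration only; the definition used in the actual post isn't shown here.

```python
from datetime import date, timedelta


def day_n_retention(
    signups: dict[str, date],
    activity: dict[str, set[date]],
    n: int,
) -> float:
    """Share of a cohort that was active exactly n days after signing up."""
    if not signups:
        return 0.0
    retained = [
        user
        for user, signup_day in signups.items()
        if signup_day + timedelta(days=n) in activity.get(user, set())
    ]
    return len(retained) / len(signups)


# Made-up cohort: four users and the days on which each was active.
signups = {
    "a": date(2024, 1, 1),
    "b": date(2024, 1, 1),
    "c": date(2024, 1, 2),
    "d": date(2024, 1, 2),
}
activity = {
    "a": {date(2024, 1, 8)},  # active 7 days after signup
    "b": {date(2024, 1, 3)},  # active, but not on day 7
    "c": {date(2024, 1, 9)},  # active 7 days after signup
    "d": set(),               # never returned
}

rate = day_n_retention(signups, activity, n=7)
print(f"Day-7 retention: {rate:.0%}")  # prints "Day-7 retention: 50%"
```

Other common variants count a user as retained if they return on or after day N, or at any point within a window, which is one reason the same experiment can produce different retention numbers.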
timespent Metric
After working on various analyses, I noticed that the way we define timespent at Tumblr was not consistent. This turned into its own task, which I tackled to clear up the confusion for everyone. It required digging through a lot of code (primarily Python with embedded SQL) to isolate the exact definitions.
Finding the right tables had become very tricky: many of them had similar names, which made it nearly impossible to tell which one was correct.
Diving deeper into this resulted in clearing 17 terabytes of unnecessary data.
This was a pretty big project: reviewing all of Tumblr's KPI charts and replacing them with more modern tools, specifically Superset, Looker, or both. Behind the scenes, this required rebuilding the dashboards in both services using LookML and custom SQL queries, and making sure the data pipeline was updated in Airflow and in our Python code.
This is a summary of a long-running issue (almost 4 months) that required the full support of Tumblr Analytics as well as our Core Engineering team to tackle. The post aimed to capture the process and the lessons learned along the way.
Data at Automattic often uses various user-defined functions (UDFs), but these were not well documented, which made onboarding into the team very confusing. I took it upon myself to learn, understand, and document the existing UDFs to make things easier for everyone else. It's a good example of how I approach documentation.
I've also authored a few posts on our public-facing blog, data.blog. Two in particular are relevant:
I've also published a couple of posts on my own blog, though not nearly as often as I'd like. For some reason, I find it easy to write for work but much harder to write for myself. Still, there's one post in particular that I'm pretty happy with: Write to help yourself, publish to help others.
This is another good one, though it's mostly a collection of questions I'd occasionally look at when preparing for 1-1s back when I was a team lead: Questions for 1-1s and teams – a primer for remote communication.
I pitched Shape Up within Automattic (even though I wasn't a developer), and it has since been picked up by multiple teams. I believe in this approach and recently even wrote about how Shape Up can be used for personal life, not just work: What If We Used Shape Up in Our Personal Lives? It was more of a thought exercise at the time.
Now that I'm applying for this junior programmer role, I'm extra sad that I don't have more public resources to pull from!
I was an organizer for WordCamp Asia 2025. As part of my work, I published a couple of blog posts:
While I don't have many decent PRs to share from open source contributions, I will share some example PRs from my internal work at Automattic that do not reveal any personal information.
office_name
a8c_devex_trial_status_decided transformation
I've created many notebooks in Jupyter Notebook to run our analyses. However, most are too long to share, so I'm only including one.
Pull request | Files changed: (Part 1); (Part 2); (Part 3); (Part 4)
There were many of them, but I'll include only a few.