Commit

Merge pull request #10 from DAboaba/dev
Dev
DAboaba authored Feb 7, 2024
2 parents 0e61377 + acc3058 commit 1436313
Showing 2 changed files with 8 additions and 10 deletions.
12 changes: 5 additions & 7 deletions posts/analysis_engineering/index.qmd
@@ -22,25 +22,23 @@ My simpler explanation leverages that lab-assistant analogy to help people under

### Naive Assumptions and Unexpected Problems

I originally came in thinking that I would do exactly what I had done for my MSc thesis - write code to analyze data, the end 😂. It didn’t take long for me to realize how naive an idea this was. But that realization didn’t necessarily come about because someone showed me the right way to do things. Instead, it came about because the nature of my work was different enough from the kinds of individual one-off analysis I had done in the past and was very familiar with that I quickly ran into problems that my existing practices, techniques, and tools couldn’t solve.
I originally came in thinking that I would do exactly what I had done for my MSc thesis - write code to analyze data, rinse, repeat, the end 😂. It didn’t take long for me to realize how naive this was. But that realization didn’t solely (or even, primarily) come through someone "showing" me the right way to do things. Instead, it came about because the nature of my work was different enough from the kinds of individual one-off analyses I had done in the past (and was very familiar with) that I quickly ran into problems that my existing practices, techniques, and tools couldn’t solve.

These problems (problems of collaboration, maintainability, scalability, reproducibility, replicability, and efficiency) often kept me awake late at night wondering if the work I had done was actually good enough. Crucially, I almost never worried about whether my analysis was good enough. Once I figured out how to implement a given research agenda, the analysis (the bit I’d been trained to do) was often trivially easy. It was everything else - things that more senior researchers didn’t really think about, hadn’t been trained to think about, and I was quickly realizing I was being paid (or more accurately underpaid 🫢) to think about - that worried me.
These problems (problems of collaboration, maintainability, scalability, reproducibility, replicability, and efficiency) often kept me awake late at night wondering if the work I had done was good enough. Crucially, I almost never had the same concerns about my analysis. Once I figured out how to implement a given research agenda, the analysis (the bit I’d been trained to do) was often trivially easy. It was everything else - things that more senior researchers didn’t really think about, hadn’t been trained to think about, and I was quickly realizing I was being paid (or maybe underpaid?! 🤔) to think about - that worried me.

So how did I solve these problems? Naturally, I did what I was told to do when I first began programming - I googled, I read, and I borrowed very very liberally from other people’s code. Over time, as I began to find solutions to these problems[^2], the way I approached my work fundamentally changed. I stopped thinking about analysis as a one-off exploration that just happens to be done with code and started thinking about results **and** code as products in and of themselves.

[^2]: I’m definitely considering making future posts that talk about these solutions - look for them under the "analysis-engineering" tag

### Software engineering by another name

I began to realize that my actual job was creating data products (analytical engines calibrated to answer specific question)[^3] for more senior researchers (and other stakeholders) using code. The more I approached my work this way, the more I realized that software engineers had encountered and solved (or mitigated) many of the problems I thought were novel, the more I adopted best practices from software engineering, and the more I began to realize that I was actually doing was ... software engineering. Put differently, I saw that I used the same tools as engineers, and found that adopting similar practices as engineers solved the problems I was encountering, so there was a high chance that my job had been horribly miscast. I was called a research analyst, but what I had been doing this whole time was a version of software engineering.
I began to realize that my actual job was creating data products (analytical engines calibrated to answer specific question)[^3] for more senior researchers (and other stakeholders) using code. The more I approached my work this way, the more I realized that software engineers had encountered and solved (or mitigated) many of the problems I thought were novel, the more I adopted best practices from software engineering, and the more I began to realize that I was actually doing was ... software engineering. Put differently, I saw that I used the same tools as engineers, and found that adopting similar practices as engineers solved the problems I was encountering, so there was a high chance that my job had been horribly miscast. I was called a research analyst, but what I had been doing this whole time was a version of software engineering.[^4]

[^3]: [`Emily Reiderer`{=html}](https://emilyriederer.netlify.app/about)` used and defined this term in a way that firmly lodged it in my conscious unconscious`{=html}

To be sure, there was something "unique" about this small kingdom of engineering. Effectively, it’s leading citizens had convinced themselves they lived in a completely different kind of kingdom so, to the detriment of less powerful citizens as well as the kingdom as a whole, they saw no value in and consequently refused to adopt and promote the practices/policies that similar, more clear-eyed kingdoms had developed and used to their advantage. Fortunately, less powerful citizens (like myself) who had begun to see just how similar our kingdom is to other (much cooler) kingdoms realized that, as long as we continued to deliver the requested results (because our betters worshiped the final results table as the greatest and most holy totem), we could secretly continue working on what really mattered - approaching the entire analysis as the software engineering exercise it was.[^4]
[^4]: Don’t get me started on how, to the detriment of younger researchers and research as a whole, a lot of senior researchers still don’t see this. Truly, don’t get me started - in an earlier draft I had a whole paragraph that doubled as a very critical, very “clever” analogy, but my editor (read wife) convinced me that it was a little too on the nose 🤷🏾‍♂️.

[^4]: Yes! This is me working out my frustrations through a "cleverly" disguised story 😉

In a word, I realized that my work is actually analysis engineering (applying statistical understanding **AND** the engineering design process to design, develop, test, maintain, and evaluate analysis code/software[^5]) and approaching it as such has made my work life much much easier. Unsurprisingly, this isn't an oiginal idea. In fact, several other people have been thinking and writing about this same idea for a while. For instance, the analysis-engineering approach is a core building block of Emily Riederer’s [RMarkdown Driven Development workflow](https://emilyriederer.netlify.app/post/rmarkdown-driven-development/). Similarly, in a fantastic [PeerJ preprint](https://peerj.com/preprints/3210/) that is absolutely worth a full read[^6], [Hilary Parker](https://hilaryparker.com) argues that, given its complexity and likelihood of error, modern analysis engineering has come to a point where encouraging (and teaching) the use of methods and tooling that push users to adopt recognized best-practices (particularly those that guard against error during the technical creation of an analysis) is not just possible but actually necessary so that analysts can more easily and naturally create work that is reproducible, accurate, and collaborative. By so doing, analysts will be freed up to focus on (and be more productive in) the actual analysis rather than focusing on not making common, avoidable, time-consuming mistakes.
In a word, I realized that my work is actually analysis engineering (applying statistical understanding **AND** the engineering design process to design, develop, test, maintain, and evaluate analysis code/software[^5]) and approaching it as such has made my work life much much easier. Unsurprisingly, this isn't an original idea. In fact, several other people have been thinking and writing about this same idea for a while. For instance, the analysis-engineering approach is a core building block of Emily Riederer’s [RMarkdown Driven Development workflow](https://emilyriederer.netlify.app/post/rmarkdown-driven-development/). Similarly, in a fantastic [PeerJ preprint](https://peerj.com/preprints/3210/) that is absolutely worth a full read[^6], [Hilary Parker](https://hilaryparker.com) argues that, given its complexity and likelihood of error, modern analysis engineering has come to a point where encouraging (and teaching) the use of methods and tooling that push users to adopt recognized best-practices (particularly those that guard against error during the technical creation of an analysis) is not just possible but actually necessary so that analysts can more easily and naturally create work that is reproducible, accurate, and collaborative. By so doing, analysts will be freed up to focus on (and be more productive in) the actual analysis rather than focusing on not making common, avoidable, time-consuming mistakes.

[^5]: Borrowed and modified very liberally from Google's definition of software engineering

6 changes: 3 additions & 3 deletions posts/welcome/index.qmd
@@ -1,15 +1,15 @@
---
title: "Straight Outta Brooklyn"
subtitle: "Will blog for money ... or a better job - ruminations on why this exists"
subtitle: "Will blog for money ... ruminations on why this exists"
image: nwa.jpg
date: "2024-01-24"
date: "2024-01-23"
categories: [news]
citation: false
---

I decided to make a blog!

For quite a while I’ve been feeling stuck in my career and unsure about how to demonstrate and build my data-science skills. All the advice I’ve heard (and read) can be boiled down into variations of “do a personal project!!”. [Macklin Flusher](https://medium.com/@macklinfluehr) has this [fantastic article](https://medium.com/@macklinfluehr/the-modern-job-search-hasnt-gotten-harder-47148b6bd479) about modern job searching. The article is worth a full read, but the TLDR is this:
For quite a while I’ve been feeling stuck in my career and unsure about how to demonstrate and build my data-science skills. All the advice I’ve heard (and read) can be boiled down into variations of “do a personal project!!”. [Macklin Flusher](https://medium.com/@macklinfluehr) has this [fantastic article](https://medium.com/@macklinfluehr/the-modern-job-search-hasnt-gotten-harder-47148b6bd479) about modern job searching. The article is worth a full read, but here's the TLDR:

::: {.callout-note appearance="minimal" icon="false"}
“The new networking is Project Work. If you want a job, you are going to have to show them you can do it”.
