Friday, June 20, 2025

Sacrifice of data: a philosophical note

I spent this morning reading about the nature of nonparametric tests and came across the phrase 'data sacrifice'. The relevant section was related to the Mann-Whitney U test. Data, in the realm of statistical studies, is often seen as the raw material, the objective input from which insights are extracted (1). Yet, beneath this seemingly neutral facade lies a profound and often overlooked philosophical dimension: the sacrifice of data. This "sacrifice" manifests in several ways, each raising significant ethical and epistemological questions.
Firstly, there is the sacrifice of individual detail for aggregate truth. Statistical methods, by their very nature, seek patterns, trends, and generalizations (2, for the rhetoric only). To achieve this, individual data points are often stripped of their unique context and reduced to mere numerical values. The rich tapestry of a person's lived experience, a specific event's nuances, or a particular observation's idiosyncrasies are subsumed into categories, averages, and distributions. What is gained is a broader understanding of a population or phenomenon; what is lost is the specific, the particular, the irreplicable. This raises questions about the very nature of truth in statistical inquiry: is it a truth of the many, attained by sacrificing the truth of the one?
Secondly, there is the sacrifice of completeness for manageability and focus. No statistical study can capture every single variable, every possible interaction, or every minute detail. Researchers, constrained by resources, time, and the very limits of human comprehension, must make deliberate choices about what data to collect and what to exclude. This exclusion is a form of sacrifice. Data deemed irrelevant, redundant, or too difficult to measure are left behind, potentially forever. While necessary for practical reasons, this act of selection shapes the conclusions that can be drawn. It highlights the inherent subjectivity in seemingly objective data collection, as the researcher's framework and assumptions dictate what is deemed "worthy" of inclusion.
Thirdly, the sacrifice of raw, unadulterated information for structured, measurable variables is a crucial aspect. Qualitative data, rich in narrative and depth, often undergoes a process of coding, categorization, and quantification to be amenable to statistical analysis (3, for the rhetoric only). This transformation, while enabling comparisons and calculations, inevitably involves a reduction of complexity. The full spectrum of meaning embedded in a verbatim response or an ethnographic observation can be lost when translated into a numerical scale or a limited set of categories. This sacrifice is a trade-off: precision in measurement is gained, but often at the expense of capturing the full richness and ambiguity of human experience or natural phenomena.
Finally, there's the more subtle, almost unnoticed, sacrifice of potential future insights. When data is collected for a specific purpose and analyzed through a particular lens, alternative interpretations or future research questions that might arise from that same data, but weren't initially considered, might be foreclosed. The way data is structured and stored, the variables chosen, and the initial hypotheses formed can inadvertently limit the scope of future inquiry. This "foreclosure" is a sacrifice of the unknown, of possibilities that might only become apparent with different theoretical frameworks or analytical tools.
In conclusion, the sacrifice of data during statistical studies is not a mere technical necessity but a profound philosophical act with ethical and epistemological implications. It underscores the constructed nature of statistical knowledge, the inherent trade-offs involved in abstracting from reality, and the powerful, yet often invisible, hand of the researcher in shaping the "truth" that emerges. Acknowledging these sacrifices invites a more critical and reflective engagement with statistical findings, reminding us that every number tells a story, but also that every story told by numbers has had to shed some of its original complexity to be heard.
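To make the phrase concrete in the setting where I met it, here is a minimal sketch of how the Mann-Whitney U statistic only looks at the ordering of the observations and sacrifices their magnitudes. The function name and the three small samples are made up for illustration, and the pairwise-counting form of U is used instead of any library routine.

```python
from statistics import mean


def mann_whitney_u(x, y):
    """U statistic for sample x against sample y (hypothetical helper).

    Counts the (x_i, y_j) pairs in which x_i exceeds y_j; a tie counts 0.5.
    Only the order of the values matters, never their size.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Illustrative, made-up samples.
group_a = [3, 5, 8]
group_b = [4, 6, 9]    # largest value only slightly above group_a
group_c = [4, 6, 900]  # same ordering as group_b, very different magnitude

u_ab = mann_whitney_u(group_a, group_b)
u_ac = mann_whitney_u(group_a, group_c)

print("U (a vs b):", u_ab)
print("U (a vs c):", u_ac)
print("mean of b:", mean(group_b), "mean of c:", mean(group_c))
```

Under these assumptions the two comparisons give the same U, although the means of the second and third samples are far apart. The size of the extreme observation is exactly the detail that has been given up in exchange for a test that does not depend on the shape of the distribution.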

References
1. https://adata.pro/blog/the-difference-between-data-information-and-insight/?hl=en-US

2. https://www.trebas.com/news-and-blogs/blogs/how-do-data-analysts-discover-meaningful-patterns-in-data?hl=en-US

3. https://getthematic.com/insights/qualitative-data-analysis/?hl=en-US

Thursday, June 19, 2025

The Yogurt Maker. Chapter 12 - notes from The Code Breaker.


A very routine need may often open the path to a great discovery. Chapter 12 describes the accidental discovery of the first gene editing concept, which came from a yogurt industry that had the ingredients of curiosity-driven development. Rodolphe B and Philippe H are the stars of this chapter, and for a while it appeared as if Doudna would no longer be required. It was certainly an amazing contribution on the part of the two industry-based scientists, who set out to make better yogurt-producing bacteria.
The author starts the chapter by introducing the idea of the 'linear model of innovation' and basic research, and eventually proves the presence of exceptions. This important contribution was a lateral stream, powered by an alternate purpose and located far from the academic mainstream. There was enough mention of the extent of engagement by the two scientists, which did not discount the notion of dedication on their part. Early publication of results, preceded by hectic scientific data collection, was crucial to the development of intuitive science. It is not the intuitive mind alone that can bring proof to the table. It has to be backed by a deliberate effort to prove the method and the matter on a time-bound schedule. Rodolphe and Philippe collected and correlated historical data that was available at the Danisco laboratory and were able to demonstrate that each virus attack was followed by a lengthening of the DNA and the development of immunity in the bacterial cells. The bacteria were editing their own genome and passing it on to the generations ahead. It was a novel concept and thrilling to the curious mind.
The author further describes the origin of the CRISPR meetings and the initiation of conventions in nomenclature. The chapter then introduces an element of surprise by introducing a controversy - the Cas system operates on DNA, a matter which was rigorously proven by Sontheimer and Marraffini from Northwestern University. It was not a twist but a statement of truth, which necessitated the revision of many ideas about how the process was working.

The chapter intends to convey the ideas of lateral-thinking minds, the nonlinearity of research and the need to pursue contextual updates in research, since the earlier the lateral streams join in, the stronger the flow shall be.


Pratyush Chaudhuri 

Wednesday, June 18, 2025

Jumping in: chapter 11 from The Code Breaker

Taking notes while reading books is a good habit to cultivate. I am initiating myself. It is a great idea to get along with.
I finished reading chapter 11 of this book, and as I reflected on the idea of this section, I realised that the author had used it to describe three central ideas or persons: Blake Wiedenheft, Martin Jinek and Cas1. Praising the author for his beautiful description of the characters may become unnecessary and repetitive.
Blake's youth, with his adventures and worldwide experience, reminded me of Charles Darwin. It reconfirms the role of an attentive mind in the wild pursuit of ideas and nature's magic. The visits to hot springs in search of thermophilic organisms were familiar, but I had never taken an interest in the idea. Today's reading reminded me to search for more, and I was thrilled to see both the organisms and how they have been applied in industry. I checked with a young engineer friend of mine and was excited to know more.


The author notes how Prof. Doudna was struck by Blake's enthusiasm about the work on Cas1, presumably suggesting the need to keep up to date with ideas as they exist today in order to be ready for a research proposal of tomorrow. There is a significant difference between a person who enjoys science and one who reveals science. The latter is a person who is up to date and hence thrives on the edge between the known and the yet to be known. And I realised that this is the crucial difference. In my personal experience, the burden of information is a difficult deterrent, as is the feeling of comfort after a thrilling idea has been understood. I have stopped too many times on the way. Young minds should keep abreast of their area of interest and keep brewing the idea till the opportunity comes to work on it.
It was interesting to see how Blake chose to approach the top-notch lab in the field and start work. It must have taken quite an effort and hubris to go ahead and ask a senior faculty member of the field ' ...any idea what is CRISPR?'


The notes on the young Martin Jinek were equally thrilling. Crystallography has too many stories to be mentioned here. After Bragg, Rosalind Franklin and many others, Martin was another of them. I am excited to read further about his contributions as the story evolves. Martin was the star of this chapter because he was able to determine the crystal structure of one particular enzyme, thus showing how it was able to cut the messenger RNA.
Interestingly, the author notes, Jinek and Blake, with different backgrounds and personalities, became complementary particles. The attribution of 'particle' must be very specific and is probably revealed later, but the idea of complementary capabilities appears to repeat itself in the progress of scientific discovery too often to be ignored.

The chapter ends with a beautiful description of Cas1. It was a fold in the molecule that became so important. I wonder how the team must have interpreted the discovery, since the mechanisms must have followed later. The definition of protein structures, and later the configurations of DNA and RNA, have long been described. Further reading appears to be necessary.

It was wonderful to read and write on this article.

Pratyush Chaudhuri