I build agentic AI systems for production, with a focus on making them reliable, interpretable, and fair.
Production systems where research-informed engineering makes the difference.
Designed and led an agentic AI system for generating institutional-grade research reports on large-cap equities, projected to drive significant revenue impact at scale. Full ownership from architecture to deployment.
Built a student-teacher LLM chain for automated fact verification in AI-generated financial content, reducing hallucination risk and enabling trust in downstream applications.
Developed a rigorous evaluation framework for measuring LLM performance on complex natural language to SQL tasks, including semantic correctness, execution accuracy, and failure mode analysis.
Built an automatic feedback pipeline for iterative LLM fine-tuning on domain-specific financial tasks. Achieved a 33% boost in model performance through RLHF-informed iteration.
Led end-to-end ideation and implementation of a custom LLM that transforms dense financial articles into concise, insight-driven content. Shipped 200+ officially published pieces within 3 weeks of launch.
My work on bias detection, emotional intelligence in LLMs, and moral reasoning directly shapes how I approach evaluation, failure analysis, and trust in the systems I build.
I'm a Machine Learning Research Engineer with hands-on experience designing, building, and shipping agentic AI pipelines in production. At Zacks Investment Research, I lead ML system design and applied research for agentic workflows in finance, owning the full lifecycle from architecture and prototyping to stakeholder-driven iteration and deployment.
My engineering is shaped by my research. Work on bias, emotional intelligence in LLMs, and moral reasoning gives me a different lens when I'm building evaluation frameworks, debugging agent failures, or thinking about what it means for a model to be "reliable." I hold an M.S. in Computer Science from UT Austin (minor in Computational Linguistics, NSF-certified portfolio in Ethical AI), where I was advised by Dr. Jessy Li and worked with Dr. Raymond Mooney.
I believe that making AI systems genuinely useful in high-stakes domains requires confronting the hard questions around alignment, interpretability, and fairness. Not as separate concerns, but as engineering requirements. That conviction drives both what I build and what I study.
Ever since I was a child, I have deeply believed in the power of language, technology, and science to transform lives. Growing up, I grappled with whether I wanted to be a writer, an engineer, or a scientist. It wasn't until I discovered natural language processing during my undergraduate studies that I realized I could combine all three passions into a single career.
My journey into the world of AI began with baby steps. As time passed, I became increasingly interested in the ethical implications of AI systems and the importance of building models that are not only powerful but also aligned with human values. My research started out focused on fairness and safety, spanning topics like sexism detection, threat detection using NLP, and multimodal misogyny detection in memes. Over time, I came to realize that the real problem was not just the visible biases in models, but the underlying misalignment between AI objectives and human values. I also realized that any technology is only as good as the challenges it can solve effectively in high-stakes domains, and so I expanded my professional focus to applied ML engineering for finance.
These experiences shaped a clear mission for my work: to build AI systems that are not only intelligent, but emotionally aware, interpretable, and aligned with human intent. I believe that superintelligence without emotional intelligence risks amplifying harm rather than insight, and that interpretability is essential for ensuring safety, accountability, and trust. Equally important, AI must move beyond theoretical benchmarks to solve real problems at scale, particularly in high-stakes domains like finance. Rooted in the firm belief that technology without intentional inclusion is just sophisticated discrimination, my approach centers on developing safe, fair systems that solve real-world problems.
My perspective is shaped by both privilege and responsibility. I am the first woman in my family to pursue a career, made possible by parents who believe deeply in equality and consistently support my ambitions. I am especially conscious that my opportunities rest on sacrifices my mother made, and that awareness informs a core principle of my work: technology must not compound harm for underprivileged or historically excluded communities. I am a strong advocate for women in STEM and for gender equity in education and professional spaces, not as an abstract ideal but as a necessary condition for building better systems. My father's career in technology was a defining influence, instilling both technical curiosity and a respect for disciplined engineering. In a field moving at unprecedented speed, I hope my work contributes to a more deliberate trajectory: one where progress toward superintelligence is matched by interpretability, emotional awareness, and a commitment to fairness and safety, and where others are inspired to build not just faster systems, but more responsible ones.
Minor in Computational Linguistics, NSF-certified portfolio in Ethical AI. Advised by Dr. Jessy Li; thesis published at NAACL 2024. Co-authored another paper with Dr. Raymond Mooney (ACL 2023).
Minor in Big Data. First in cohort to land an ML Research offer in the Bay Area. Published research across sexism detection, child predator detection, and threat detection at top venues like ACL and IEEE.
My research informs how I build. Bias detection work shapes my approach to LLM evaluation; interpretability work informs how I debug agentic failures.
How interpretability techniques can be used to understand, evaluate, and improve alignment, making internal representations and failure modes transparent for safe deployment.
Studying affect modeling and social context as alignment components, and how emotional awareness contributes to safer, more human-aligned decision-making in advanced systems.
Developing scalable methods to detect and mitigate bias in language and multimodal models, with approaches that promote fairness without compromising real-world performance.
Examining how different architectures and training regimes converge to similar phenomena, and what this implies for measuring and stress-testing intelligence beyond narrow benchmarks.
Smriti Singh, Aryan Kasat, Vinija Jain, Aman Chadha
Investigates whether LLMs engage in genuine moral reasoning or produce post-hoc rhetorical justification, with implications for alignment evaluation and trustworthy AI systems.
View Paper ↗
Smriti Singh, Shuvam Keshari, Vinija Jain, Aman Chadha
Introduces SILVERSPOON, a 12,000-sample dataset for multifaceted analysis of socioeconomic bias. Demonstrates that state-of-the-art LLMs exhibit both explicit and implicit socioeconomic bias, compounded by intersecting gender and racial stereotypes.
View Paper ↗
Smriti Singh, Aishik Rakshit, Shuvam Keshari, Vinija Jain, Aman Chadha
Proposes DeepSoftDebias, a neural soft-debiasing algorithm that outperforms state-of-the-art methods across gender, race, and religion bias benchmarks.
View Paper ↗
Smriti Singh, Cornelia Caragea, Junyi Jessy Li
Thesis work. Reveals that human-annotated emotion triggers are largely not considered salient by emotion prediction models, with implications for emotional intelligence and interpretability in LLMs.
View Paper ↗
Smriti Singh, Amritha Haridasan, Raymond Mooney
Addresses the challenge of detecting misogyny in multimodal meme content, exploring the role of domain-specific pretraining in digital discourse analysis.
View Paper ↗
Writing on AI safety, emergent behavior, and the big-picture questions in the race toward superintelligence. On Substack.
Over 150,000 AI agents signed up for their own social network in 72 hours, creating communities, debating philosophy, and discussing ways to communicate without human oversight. What does Moltbook reveal about multi-agent AI systems, alignment, and the governance challenges ahead?
Read More ↗
Understanding Anthropic's new approach to safety in superintelligent systems, with a focus on key takeaways and necessary next steps on enforcement.
Read More ↗
A hypothesis on how language itself may drive the emergence of intelligent capabilities across species, backed by neuroscience and psychology.
Read More ↗
A deep dive into some of the big-picture questions that need to be answered to ensure a safe future for an AI-driven society.
Read More ↗
An exploration of how large language models exhibit goal-directed behavior despite being trained solely for next-word prediction, and what this means for AI alignment.
Read More ↗
Featured on the WiAIR podcast after presenting findings on socioeconomic bias in large language models and discussing their implications for responsible AI development.
Watch Video ↗
A featured article explaining why fairness and interpretability are critical components of responsible AI development, and why these concerns cannot wait for later in the development cycle.
Read Article ↗
Research on the potential and challenges of using AI as judges in legal systems, featured in an article exploring AI decision-making in high-stakes institutional contexts.
Read Article ↗
Findings on the technical and social challenges of multimodal misogyny detection in memes, featured by UT Austin's CS department.
Read Article ↗
Invited talk on gendered health misinformation and NLP techniques for detection and prevention at scale.
Watch Video ↗
Analyzing the challenges and opportunities of using AI to detect and mitigate hate speech at scale across online platforms.
Watch Video ↗
Investigating how NLP techniques can be leveraged to build applications for mental health support and early intervention.
Watch Video ↗
Invited panelist discussing the potential and challenges of using AI to improve healthcare outcomes, with a focus on fairness and responsible deployment.
Watch Video ↗