Weeks 13 & 14

Coursework

Neural Networks and Deep Learning:

These two weeks were basically a deep learning bootcamp. We started with the fundamentals in TensorFlow/Keras: how the Sequential API works, how layers stack, and why activation functions matter.

I built an ANN on MNIST with exactly three dense layers, a Flatten layer, and dropout. Then I visualized everything: raw images, training/testing splits, the first and last epochs, loss/accuracy curves, and a final confusion matrix to see where the model stumbled. Check out the code I wrote here!
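The actual model was built with the Keras Sequential API, but here's a hedged numpy sketch of what the forward pass is doing underneath. The layer widths (128, 64) and the dropout rate are my placeholders, not the exact numbers from my notebook:

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(images, params, train=False, drop_p=0.2):
    """Flatten -> Dense -> Dense -> Dropout -> Dense(softmax)."""
    W1, b1, W2, b2, W3, b3 = params
    x = images.reshape(images.shape[0], -1)   # Flatten: (N, 28, 28) -> (N, 784)
    h = relu(x @ W1 + b1)                     # first dense layer
    h = relu(h @ W2 + b2)                     # second dense layer
    if train:                                 # inverted dropout, training only
        mask = rng.random(h.shape) >= drop_p
        h = h * mask / (1.0 - drop_p)
    return softmax(h @ W3 + b3)               # third dense layer -> class probabilities

# random weights, just to check that shapes flow end to end
params = (rng.normal(scale=0.05, size=(784, 128)), np.zeros(128),
          rng.normal(scale=0.05, size=(128, 64)), np.zeros(64),
          rng.normal(scale=0.05, size=(64, 10)), np.zeros(10))
probs = forward(rng.random((4, 28, 28)), params)
```

Each row of `probs` is a distribution over the ten digits, which is exactly what the confusion matrix at the end summarizes.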

After that, we moved into CNNs. This part really forced me to understand image shapes, filters, cross-correlation, max pooling, and flattening, because I illustrated every step by hand. That made the CNN architecture much easier to grasp. Check out the code I wrote here!
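The steps I illustrated can be sketched in a few lines of numpy. The 4x4 image and the vertical-edge kernel are toy values I made up for the walkthrough:

```python
import numpy as np

def cross_correlate(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel, multiply, sum (no flipping)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.empty((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the strongest activation in each window."""
    H, W = fmap.shape
    return fmap[:H - H % size, :W - W % size] \
        .reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)
edge = np.array([[1.0, -1.0], [1.0, -1.0]])   # simple vertical-edge detector
fmap = cross_correlate(image, edge)           # (4, 4) image -> (3, 3) feature map
pooled = max_pool(fmap)                       # (3, 3) -> (1, 1) after trimming
flat = pooled.ravel()                         # flatten before the dense layers
```

Tracing the shapes through convolution, pooling, and flattening like this is what finally made the architecture diagrams click for me.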

Next came RNNs and LSTMs. I trained both on a sequential/text dataset and compared how they handled dependencies. The LSTM’s ability to remember longer patterns felt very real once I saw the training curves side-by-side. Check out the code I wrote here!
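To make the LSTM's "longer memory" concrete, here's a hedged numpy sketch of a single LSTM step. The sizes and the deliberately biased gates are mine, chosen to show the mechanism: when the forget gate saturates near 1 and the input gate near 0, the cell state barely drifts, which is exactly what lets LSTMs carry long-range patterns:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, params):
    """One LSTM step: gates decide what the cell state forgets, writes, and exposes."""
    Wf, Wi, Wc, Wo, bf, bi, bc, bo = params
    z = np.concatenate([h, x])       # previous hidden state and current input
    f = sigmoid(Wf @ z + bf)         # forget gate: how much old memory survives
    i = sigmoid(Wi @ z + bi)         # input gate: how much new info is written
    c_hat = np.tanh(Wc @ z + bc)     # candidate memory
    c_new = f * c + i * c_hat        # cell state: the long-term track
    o = sigmoid(Wo @ z + bo)         # output gate
    return o * np.tanh(c_new), c_new # new hidden state, new cell state

# gates biased to "keep everything, write nothing"
d_h, d_x = 2, 3
Wz = np.zeros((d_h, d_h + d_x))
params = (Wz, Wz, np.ones((d_h, d_h + d_x)), Wz,
          np.full(d_h, 10.0),   # forget gate saturated near 1
          np.full(d_h, -10.0),  # input gate saturated near 0
          np.zeros(d_h), np.zeros(d_h))
c0 = np.array([0.7, -0.3])
h, c = np.zeros(d_h), c0.copy()
rng = np.random.default_rng(0)
for _ in range(50):                  # 50 random inputs later, c is still close to c0
    h, c = lstm_step(rng.normal(size=d_x), h, c, params)
```

A vanilla RNN has no such gated cell state, so its hidden state gets overwritten at every step; that difference is what the side-by-side training curves were showing.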

On the theory side, I worked through backprop in an RNN unrolled over four time steps, which was messy but very interesting.
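My homework used a specific setup I won't reproduce here, but the core of backprop through time fits in a short numpy sketch: unroll four steps, walk them in reverse, and accumulate the gradient for the shared recurrent matrix. The toy loss and sizes are my assumptions; the finite-difference check at the end is how I convinced myself the hand derivation was right:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 3                                  # four time steps, hidden size 3
W = rng.normal(scale=0.5, size=(d, d))       # recurrent weights (toy: no input matrix)
xs = rng.normal(size=(T, d))                 # one input vector per step
h0 = np.zeros(d)

def forward(W):
    hs = [h0]
    for t in range(T):
        hs.append(np.tanh(W @ hs[-1] + xs[t]))   # h_t = tanh(W h_{t-1} + x_t)
    loss = 0.5 * np.sum(hs[-1] ** 2)             # toy loss on the final hidden state
    return loss, hs

def backward(W):
    """Backprop through time: reverse the four steps, accumulating dL/dW."""
    _, hs = forward(W)
    dW = np.zeros_like(W)
    dh = hs[-1]                                  # dL/dh_T for the 0.5*||h_T||^2 loss
    for t in reversed(range(T)):
        da = dh * (1.0 - hs[t + 1] ** 2)         # through tanh (hs[t+1] = tanh(pre_t))
        dW += np.outer(da, hs[t])                # W is reused every step, so grads add
        dh = W.T @ da                            # pass the gradient back to h_{t-1}
    return dW

def finite_diff(W, eps=1e-6):
    g = np.zeros_like(W)
    for i in range(d):
        for j in range(d):
            Wp, Wm = W.copy(), W.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            g[i, j] = (forward(Wp)[0] - forward(Wm)[0]) / (2 * eps)
    return g
```

The `W.T @ da` line repeated four times is also where you can see vanishing/exploding gradients coming from: the same matrix multiplies the gradient once per unrolled step.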

We wrapped up with Transformers and LLMs: attention, GPTs, and how the simple “predict the next word” objective scales into the giant models we use today. It overlaps nicely with what I’ve been learning in the NLP class.

Natural Language Processing (NLP):

We continued building on last month’s foundations and went deeper into how modern LLMs are actually trained and aligned. We covered the full pretraining/post-training pipeline: pretraining, instruction tuning, preference alignment, reward modeling, and reinforcement-learning-based alignment. Then we explored how models learn from human preferences using the Bradley-Terry model, how scoring (reward) models are built, and why the distinction between the frozen reference model and the learned policy matters.
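The Bradley-Terry piece is small enough to write down. The idea: the probability that answer A is preferred over answer B depends only on the gap between their scalar rewards, and a reward model is trained by minimizing the negative log-likelihood of the human's actual choices. A minimal sketch (the function names are mine):

```python
import numpy as np

def bt_prob(r_a, r_b):
    """Bradley-Terry: P(A preferred over B) = sigmoid(r_a - r_b)."""
    return 1.0 / (1.0 + np.exp(-(r_a - r_b)))

def reward_model_loss(r_chosen, r_rejected):
    """Negative log-likelihood a reward model minimizes on one preference pair."""
    return -np.log(bt_prob(r_chosen, r_rejected))
```

Equal rewards give a 50/50 preference, and widening the gap in favor of the chosen answer drives the loss down, which is all the training signal a reward model needs.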

We also learned two major optimization approaches for alignment: PPO (Proximal Policy Optimization) and DPO (Direct Preference Optimization), including the math, the intuition, and the training requirements for each.
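The DPO objective, at least, is compact enough to sketch. For one preference pair it only needs four log-probabilities: the chosen and rejected answers scored under the policy being trained and under the frozen reference model (no separate reward model, which is the whole point versus PPO). Variable names and the beta value are my choices:

```python
import numpy as np

def dpo_loss(pi_w, pi_l, ref_w, ref_l, beta=0.1):
    """DPO loss on one pair: pi_* / ref_* are log-probs of the chosen (w) and
    rejected (l) answers under the policy and the frozen reference model."""
    margin = beta * ((pi_w - ref_w) - (pi_l - ref_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))   # -log sigmoid(margin)
```

When the policy matches the reference, the margin is zero and the loss sits at log 2; raising the chosen answer's log-probability relative to the reference pushes the loss down, with beta controlling how hard the policy is allowed to drift.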

On the prompting side, we covered in-context learning, system prompts (like Claude’s), task definitions, demonstrations, chain-of-thought prompting, zero-shot CoT, test-time compute, best-of-N sampling, beam search, and revision-based reasoning approaches.
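Of those test-time-compute ideas, best-of-N sampling is the simplest to sketch: draw N candidate answers, score each one, keep the best. The "model" and "scorer" below are toy stand-ins I made up (an iterator of canned answers and plain `len` as the reward), just to show the control flow:

```python
import random

def best_of_n(generate, score, prompt, n=4, seed=0):
    """Best-of-N: sample N candidate answers, return the one the scorer ranks highest."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)

# toy stand-ins: a "model" that replays canned answers and a
# "reward model" that just prefers longer ones
canned = iter(["42", "maybe 41?", "It is 42 because 6 * 7 = 42.", "no idea"])
best = best_of_n(lambda prompt, rng: next(canned), len, "What is 6 * 7?", n=4)
```

In a real system `generate` would sample from an LLM and `score` would be a reward model or verifier; the trade-off is simply N times the inference cost for a better expected answer.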

We're now getting into information retrieval in the age of LLMs: ranked retrieval, ad-hoc retrieval, relevance scoring, sparse vs. dense retrieval, tf-idf and PMI, inverted indexes, precision/recall curves, MAP, and how modern LLM systems replace classic IR methods with single-encoder and bi-encoder dense retrieval.
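Two of those classic-IR pieces fit in a tiny sketch: the inverted index (term to the set of documents containing it) and tf-idf scoring on top of it. The three mini-documents are mine, and I've left out the smoothing variants for clarity:

```python
import math
from collections import defaultdict

docs = {1: "cat sat on the mat", 2: "dog sat on the log", 3: "cat chased the dog"}

# inverted index: term -> set of doc ids (the core sparse-retrieval structure)
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def tfidf(term, doc_id):
    """tf-idf: frequent in this doc, rare across the collection => high score."""
    words = docs[doc_id].split()
    tf = words.count(term) / len(words)
    idf = math.log(len(docs) / len(index[term]))   # no smoothing, for clarity
    return tf * idf
```

A word like "the" that appears in every document scores exactly zero, which is the sparse-retrieval intuition that dense bi-encoders replace with learned embeddings.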

Since this is a lot of concepts, I decided to start a small NLP 101 series where I break down what I’m learning in a simple, friendly way, both to reinforce my own understanding and to share it with others. Check out the blog! 

Information Visualization:

We wrapped up classes, and most of the sessions were classmates presenting their final projects. It made me realize how much I enjoyed this course and how important it is to keep practicing good visualization habits. So I’m starting a small weekly ritual: I’ll take one “bad” chart and redesign it into something clearer and more meaningful. Think of it as a tiny InfoViz application each week. I’m still deciding what to call this series, but I’m leaning toward something like Viz Fix. Open to ideas! 


Snowflake BUILD

I also spent time in the Snowflake Data Heroes Community over these two weeks and participated in both the AI Bootcamp and the Data Engineering Bootcamp.

 
 

One of the highlights was the luminary session with Andrew Ng (my first ML teacher), Swami Sivasubramanian (VP, Agentic AI at AWS), and Sridhar Ramaswamy (CEO of Snowflake), where they talked about the AI blueprint for the next decade.



I attended a few hands-on sessions too, including Smart SAS Converter: Build, Migrate, Modernize, a live demo showing how to use Snowflake Cortex LLMs inside a Streamlit app to convert legacy SAS code into modern SQL or Python. The session covered model switching (snowflake-arctic, mistral-large, gemma-7b), prompt chaining, and how this can scale refactoring across an entire organization.

Another session I liked was Postgres AI: A RAGs to Riches Story, which tied together retrieval-augmented generation and database optimization; super helpful for thinking about real-world data engineering workflows.


Boulder Climate Ventures: Ag Tech & Nuclear

Agriculture both contributes significantly to global emissions and faces growing risks from climate change, which makes it one of the most important sectors for climate innovation. This session looked at how Ag Tech is advancing sustainability, improving resilience, and shaping the future of global food systems.

Nuclear energy is a low-carbon powerhouse but also one of the most debated climate solutions. This session unpacked nuclear’s role in the clean-energy transition, covering reliability, safety, innovation, regulation, and the long-term sustainability questions shaping the field.


Even though these weren’t “pure data science” talks, both sessions ended up being really relevant to me. Ag Tech and nuclear are industries where data is becoming the backbone of innovation, from satellite imagery and remote sensing to yield prediction models, sensor-driven automation, safety monitoring, and large-scale simulation. Listening to the speakers made it clear how much these sectors rely on machine learning, forecasting, anomaly detection, optimization, and large datasets to solve real problems.
 
It gave me a better sense of where data scientists fit inside climate tech and how technical work directly connects to climate resilience, energy reliability, and large-scale sustainability challenges.


Boulder AI Builders 

I also made it to the Boulder AI Builders meetup at Founder Central by Sweater, the last one of the year. This one had a really strong lineup of demos. I heard from DekaBridge, which had one of the coolest products of the night, along with ZiOsec, Ranger, Datadog, and a Claude demo from Anthropic.


It was very engineering-focused: people showing what they’re actually building, what’s breaking, and how they’re using AI in real systems. For me, events like this are useful because they’re a reality check on where the industry is heading. You get to see the kind of data pipelines, infra decisions, evaluation setups, and model-product integrations that companies are actually using.


It’s a nice contrast to academic examples; very applied, very “here’s what works in production.” Good energy, great demos, and a solid way to stay connected with the builder community in Boulder. 



AI and the Future of Copyright Politics

I also attended a session on how generative AI is reshaping U.S. copyright politics. The conversation highlighted how the rapid growth of LLMs has pulled copyright law into the center of AI regulation, especially as models are trained on massive amounts of copyrighted content. Different groups (creators, rightsholders, policymakers, AI companies, and public-interest advocates) are suddenly being forced into new alliances and disagreements that didn’t exist before.

The event explored both the forward-looking questions (fair use, litigation, the role of the U.S. Copyright Office, and future regulatory frameworks) and the backward-looking implications for remix culture, accessibility, consumer rights, and how communities engage with AI-generated content. Even though it wasn’t a technical session, it gave a very helpful framing of how policy shapes what we can and cannot build. It's something that’s becoming increasingly important for anyone working with AI.


AI Advantage Summit

I also stopped by the AI Advantage Summit by Dean Graziosi and Tony Robbins this month. It wasn’t really aligned with the kind of technical, engineering-focused content I usually look for, but it was still a good reminder of how broad the AI space has become and how many different audiences it tries to reach. Helpful perspective, even if not directly relevant to my work.



Datathon

We also had our MSDS Datathon this month, hosted with DaSSA, INFO Buffs, and HackCU at the CMDI Studio. The day started with a GitHub workflows session by CU Libraries, which was a good reminder of how much cleaner collaborative projects feel when version control is set up properly.

Then we moved into fine-tuning LLMs with Leo, where we experimented with LandingLens. It was interesting to see how quickly you can prototype computer vision ideas when the tooling removes most of the setup overhead.



Jay Ghosh’s session on choosing between traditional ML models and LLMs was genuinely helpful. It reinforced the idea that a bigger model isn’t always a better model, and that model choice still comes down to context, constraints, and what you’re actually trying to solve.

We wrapped up with a Devpost session by Uditanshu. He shared how much presentation and storytelling, along with the code itself, shape the final impact of a project.



I worked on a project, but ended up being a little late for the submission. Still, it was a fun, high-energy day: building with LLMs, meeting people from different departments on campus, getting feedback from mentors, and being surrounded by a community that genuinely enjoys making things.



I think this is how I’ll end my blogs from now on, with a little piece of Boulder. I really wish I’d captured the Aurora Borealis better, but it was so pretty. Did you get to catch a glimpse of it, too?

Thanks for reading along. 
See you next time! :)
