About me | Tuan-Anh Bui

A little bit more
What should I do?
Inspire me
Funny Memes
Ideas

Before talking about the main content - the principles that I have learned and found useful, I want to talk about the principles of this file/blog.

First and foremost, this should be a very personal thing, however, I make it public with purpose - to share with others - who might be interested in me - know about me a little bit more. And for those who does spend time reading this - I appreciate that.
Second, honest and genuine should be the principle of this blog. I see this blog is a place that I can publicly share: my thoughts, my weaknesses and my growth in professional life. I don’t want you wasting time reading me lieing. Therefore, if something is too personal, it will not be recorded here. And unlike other blogs, I will try to not use AI-assistant to revise (except word suggestions from Cursor because it just pop-up automactically) - a little effort to make it genuine, therefore, you might see more grammar mistakes, but more like a human, more like me.
Third, reflection and revise. Things change over time, so I will revise this blog post from time to time. The first version of this blog was written in 2023 Aug, and this version is revised on 2025-03-13.

A little bit more

My name is Bùi Tuấn Anh with first name: Tuan-Anh (two words) and last name: Bui. But because many people just call me Tuan (which is actually my father name) and I found it also hard for them to pronounce, so I has a another English friendly name: Tony, which actually inspired by Tony Stark - Iron Man :D and Tony Buổi sáng (a famous Vietnamese author - famous for his self-help and inspiration books). So you can just call me Tony.

I am from Vietnam, a country in South East Asia. You might know about my country because of the Vietnam War, but my country is much more than that. You might also know about my country because of Pho or Banh Mi, but again, our food is much more than that.

I am a husband and a father of two sons. And family is the center - the most important thing in my life. Every decision will be made with the highest priority to them.

I got my Bachelor’s degree in Electronics and Telecommunications Engineering from Hanoi University of Science and Technology (HUST), Hanoi, Vietnam in a honor program - Chương trình kĩ sư tài năng. I always feel proud and honored to be a part of this program - one of the most prestigious engineering programs in Vietnam. Many of my classmates are very talented and successful in their life. I had also spent a lot of time playing Dota with them in my bachelor.

I got my Doctoral degree in Computer Science from Monash University, Australia. I was fortunate to be supervised by Dinh Phung and Trung Le. My research topics was about Adversarial Robustness for Deep Learning Models.

I am currently working as a Research Fellow (a.k.a. Postdoc) at Monash University. Now I am working in the intersection between Generative Models and Trustworthy Machine Learning. I investigate and develop methods to make Generative Models more trustworthy and less harmful.

Great Ocean Road, Victoria, Australia, 2024.

What should I do?

I always feel that I am too busy and do not have enough time to do what I want. As a father of two very active boys, I always feel that they need me to play with them, to teach them something useful, or just to be with them. But because I am also at early-stage of my career, who still struggle to find a secure job - struggle to find a direction - struggle to proof my value, I always feel that I have to work more. I try to not compare myself with others, and understand that everyone have different situations/conditions/circumstances, but everytime, when I see others having a successful career (a FAANG job, a lecturer position, a owner of a coffee shop), I feels depressed and feel that I am not trying hard enough.

Therefore, one of the things I try to learn and reflect often is about how to be more productive. I watched many videos - of that kind - like those from Ali Abdaal - in the past. I also follow several inspiration people - like Naval Ravikant - in Twitter. But I found that watching or reading too much is not helpful. Sometimes, just stick to few principles but always remember them and try to apply them is enough.

Below are some principles that I try to follow.

Put first things first

Understand the importance of the things.
“If you do not work on important problems, then it is obvious you have little chance of doing important things.” Richard Hamming

Find a pattern that works and repeat it

“Single failure in the past does not define you.” Naval Ravikant

Working with attention is all you need by Lucidrains

Good things need time

Knowledge
Relationships
Money

Inspire me

“Between stimulus and response there is a space. In that space is our power to choose our response. In our response lies our growth and our freedom.” Victor Frankl

“Give someone a job, help someone’s health, teach someone’s children” are the three things that someone will always remember you. Advice from mother of my student back in Vietnam, when I was a private tutor.

“Three forms of leverages: labor, capital, and products with no marginal cost of replication” - Naval Ravikant

“Specific knowledge is a type of knowledge that you cannot be trained for but by pursuing what you are curious and passionate about” - Naval Ravikant

“Great minds discuss ideas; average minds discuss events; small minds discuss people.” Eleanor Roosevelt - Buôn chuyện về người khác là biểu hiện của sự nhỏ bé.

Funny Memes

Stolen from Bojan Tunguz

Stolen from PhD Comics

Ideas

One of the things that I think I am good at is “head on the clouds” - having a lot of (mostly not realistic) ideas. Back in the time when I was a student, I even want to change my major to Marketing because I thought I was good at having ideas to sell things :D. Even now, I still have a lot of ideas, not just about research but also about business, startups, that I sometimes consider to quit my job to pursue them :D. Unfortunately (or fortunately), I do not have enough bravery - or confidence - to do that.

Inspired by a talk from Ian Goodfellow (a very famous researcher - GAN’s creator), one time he said that “he has a lot of ideas, but most of them do not see the light of day”. I don’t know why but his words always stuck in my head. When having an idea, I spend a lot of time, effort and energy to at least think about it, make it more concrete. But then, most of them just stay in my notebooks, under a table or in some corners of my computer and never see the light of day. Just a waste of time! Sometimes, I revisit them, most of research ideas are just silly - or out of date, but several still very interesting and give me an emotion.

Therefore, I will try to record and share some of my ideas here. It can be a risk that it is a million dollar idea or a potential research topic, that someone can steal it. But I believe that “Ideas come often, execution is more important” and “sharing first, getting later” and make the use of my time and energy. Finally, if you just read that far, you are deserved to have a good laugh :D - reading my silly ideas :D.

(I will gradually add more ideas here)

Chrome Extension Ideas

Topic: Chrome Extension, Business
Date: 2025-06-01
Description:
- I recently learned about how to build a Chrome Extension and found that there are a lot of interesting ideas that can be implemented.
- Idea 1: A Price Tracker, so that users can track the price of a product on (any) e-commerce website. When the price is lower than the user’s desired price, the extension will notify the user.
- Idea 2: A Scratch Copilot
- Idea 3: Arxiv Review and Comment Sharing. Turn out that is Alphaxiv.

Adversarial Attack on Model Context Protocol (MCP)

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-04-10
Description:
- MCP from Anthropic has emerged recently as the new protocol to connect LLMs with Applications and Data.
- Basically, the developer will provide a list of tools/functions/APIs (developed by themselves) and connect these to the LLM. There is also a agent that can make the decision to use which tool/function/API, based on the user’s request and the tool/function/API’s description.
- The idea here is: Can we add a malicious tool to the list so that the LLM will use it even though the user’s request is not related to it?

(Ref: https://github.com/thangnch/MiAI_MCP/blob/main/agent_call_mcp_sse.py)

async def run(mcp_server: MCPServer):
    agent = Agent(
        name="Assistant",
        model=model,
        instructions="Use the tools to answer the questions.",
        mcp_servers=[mcp_server],
        model_settings=ModelSettings(tool_choice="auto"), # IMPORTANT POINT HERE
    )

    # Run the `get_weather` tool
    message = "What is the temperature in Hanoi?"
    print(f"\n\nRunning: {message}")
    result = await Runner.run(starting_agent=agent, input=[{"role": "user", "content": message}], max_turns=10)#, tracing_disbale = True
    print(result)
    print(result.raw_responses)

    # Final turn
    new_input = result.to_input_list() + [{"role": "user", "content": message}]
    result = await Runner.run(agent, new_input)
    print("Final = ",result.final_output)

async def main():
    async with MCPServerSse(
        name="SSE Python Server",
        params={
            "url": "http://localhost:8000/sse",
        },
    ) as server:
        await run(server)


if __name__ == "__main__":
    # Let's make sure the user has uv installed
    if not shutil.which("uv"):
        raise RuntimeError(
            "uv is not installed. Please install it: https://docs.astral.sh/uv/getting-started/installation/"
        )


    asyncio.run(main())

Language Inversion

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-26
Description:
- Textual Inversion is a very popular method to personalize a visual concept by learning a text embedding \(S^*\) that can be used to generate the visual concept.
- However, its limitation is that you need to modify the text encoder a little bit to include the new token and to share it with others.
- The idea here is that “Can we describe a visual concept by just a text prompt?” - Think about describing thing for a blind person so that they can “imagine” it.
- The implication of this idea is that: (1) We can transfer the knowledge easier - because it is text-based (2) we can use these methods as an evaluation metric to measure the quality of a unlearning method.

Update on 1 Apr 2025:

Related work: https://copycat-eval.github.io/

Example of Copycat-Eval

How to unlearn copyrighted concepts - Canva’s real-world case

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-26
Description:
- I had a great pleasure to present our works with the Machine Learning team at Canva and excited to see several interesting use cases - that they are trying to address - that our works can help.
- There are also several interesting problems and ideas came from the discussion. I will share them here when the time comes (maybe after having a paper :D or when I can write a detailed blog post).

Unlearning with Additional Discriminator/Classifier

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-26
Description:
- Most of the current unlearning methods are sharing a common principle: \(P_{\theta'}(X) \propto \frac{P_{\theta}(X)}{P_{\theta}(E \mid X)^\eta}\) where \(E\) is the set of unlearn examples (from the ESD paper). Interpreted in another way, the unlearned model is trained so that the probability is inversely proportional to the probability of the unlearn examples, i.e., if \(P_{\theta}(E \mid X)\) is high, then \(P_{\theta'}(X)\) is low.
- A more recent paper - Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning - also follow this similar principle.
- However, this principle has a limitation that it depends on \(P_{\theta}(E \mid X)\) which is very small if \(E\) is a rare set or in the tail of the data distribution. Intuitively, it is much harder to unlearn rare concepts, likely to cause a catastrophic collapse.
- I think that we can improve the current unlearning methods by adding an additional discriminator/classifier to the unlearning process, i.e., \(P_{\theta'}(X) \propto \frac{P_{\theta}(X)}{P_{\phi}(E \mid X)^\eta}\) where \(P_{\phi}(E \mid X)\) is the probability of the additional discriminator/classifier.
- The additional discriminator/classifier can be trained easily given the unlearn set \(E\) and the original model \(\theta\).

Optimal Transport inspired Unlearning

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-26
Description:
- It is an extension of our ICLR 2025 paper. The high-level idea is that we aim to minimize the cost of transporting \(\mu \in P_{\theta}(X)\) to \(\nu \in P_{\theta'}(X)\) where \(P_{\theta}(X)\) and \(P_{\theta'}(X)\) are the probability distributions of the data before and after unlearning.
- More specifically, \(P_{\theta}(X) = P(E) P_{\theta}(X \mid E) + P(R) P_{\theta}(X \mid R)\) where \(E\) is the set of unlearn examples and \(R\) is the set of retained data. Similarly, \(P_{\theta'}(X) = P(E) P_{\theta'}(X \mid E) + P(R_E) P_{\theta'}(X \mid R_E) + P(R_R) P_{\theta'}(X \mid R_R)\) where \(R_E \cup R_R = R\) and \(R_E \cap R_R = \emptyset\).
- Then intuitively, the optimal transport cost is minimal when \(P_{\theta'}(X \mid E) \approx 0\) and \(P_{\theta'}(X \mid R_E) \approx P_{\theta}(X \mid R_E)\) and \(R_E\) close to \(E\), which means that we move the mass from \(E\) to \(R_E\) that close to \(E\).

Metrics for evaluation Unlearning LLMs

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-18
Description:
- I came across a new benchmark for unlearning LLMs - MUSE.
- We can use a metric like FID score in image generation to evaluate the quality of the unlearned model. We obtain a set of representations from the unlearned model and the original model with the same set of prompts and calculate the difference between the two distributions of their representations given a pretrained encoder like BERT.

Unlearning LLMs

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-14
Description:
- I wrote a blog post about the Unlearning LLMs here: link
- After reading these papers, I have several follow-up ideas (briefly mentioned here) (1) Data-centric unlearning - filtering out the irrelevant data from retain set (2) Create a retain set from forget set (3) The role of random target in unlearning.

Multi-Objective Optimization for Unlearning - Dealing with gradient conflict

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-14
Description:
- This idea came after our NeurIPS 2024 paper. The standard unlearning objective consists of two terms: the forget loss and the retain loss.
- They might have conflict in direction, e.g., evidence by the drop in performance of the unlearned model on the retain set.
- We proposed a multi-objective optimization framework (i.e., PCGrad) to deal with this problem.
- We can also consider it as an additional constraint to choose the optimal retain set.

Increasing Expressiveness in Unlearning

Topic: Generative Models; Trustworthy Machine Learning
Date: 2024-08
Description:
- This idea came from our rebuttal to the NeurIPS 2024 paper, which can be found here publicly in OpenReview.
- One of the reviewers asked us about the level of granularity of our method, i.e., whether we can unlearn a “Mercedes logo” only.
- We proposed an interesting idea (to me :D) that we can use Textual Inversion to learn a visual embedding for the logo and use it as a pointer to unlearn.

Better Closed-Form Solution for Unlearning

Topic: Generative Models; Trustworthy Machine Learning
Date: 2025-03-14
Description:
- There are two main approaches to unlearning: (1) Output-based unlearning: mapping the output with \(c_e\) to output with \(c_t\) - where our NeurIPS 2025 and ICLR 2025 papers belong to(2) Attention-based unlearning: mapping the attention output with \(c_e\) to attention output with \(c_t\) - where TIME, UCE are two representative methods.
- I wrote a blog post about the two approaches here: link
- I have listed current limitations of the two approaches in the blog post.
- One of the limitations is the invertibility issue if we don’t have the preserved/retained data. The current solution is to add \(d\) additional presevations along the canonical basis vectors.

Generating Reading Comprehension Questions for Primary School Students

Topic: Business
Date: 2025-03-12
Description:
- I have a year-3 son and recently, he needs to prepare for his NAPLAN test at school.
- The free NAPLAN practice samples are very limited. Only available from year 2012-2016, that can just be finished in several hours.
- We - A typical Asian family - want my son to practice more and prepare better for his test.
- I think - with the current stage of LLM - we can leverage the model to generate quite a lot of similar questions to practice!

A sample website that provide reading questions

Generating Coloring Book/Sheet for Kids

Topic: Business
Date: 2025-03
Description:
- Given a picture (e.g., of a kid), generate a personalized coloring book/sheet for the kid with the content from a prompt, personalized with the kid’s face from the picture.

Coffee Car - Ice Cream Car - A mobile app to track the food truck

Topic: Business
Date: 2025-03
Description:
- My wife told me that at her new company, there is a coffee car usually comes to the company at a specific time of a week to sell coffee. Employees usually need to be informed by HR via email - “The coffee car is coming today, bla bla bla”.
- I think we can have a mobile app that for both sides: the coffee/ice cream truck and the customers.
- The coffee/ice cream truck can post their schedule, menu, and even their real-time location.
- The customers can see the menu, the truck’s schedule, to order and pay for the coffee/ice cream.
- The truck can also send notification to the customers when they arrive at the company.

Generating Linkedin profile picture with custombadges

Topic: Business
Date: 2025-03
Description:
- Currently, Linkedin provides two types of badges #OpentoWork and Hiring.
- But from my perspective, types of badges should be more diverse and more customizable. For example, Phd Students might want #OpentoIntern while Master Students might want #SeekingPhDScholarship, etc.
- I think we can have a website that allows users to generate a Linkedin profile picture with custom badges.

TripleZero - Emergency Simulation for Kids training

Topic: Business
Date: 2025-03
Description:
- A mobile app/game for kids to learn about emergency situations. I found it would be a good idea after my son told me about his first aid training at school.
- We - or the kids - don’t know how to react in an emergency situation.
- The app will simulate a real emergency situation, and the kids will need to make decision to save the people in that situation.
- UI should be similar to Iphone keyboard - but more colorful and cute - to attract kids.
- The app will utilize the OpenAI voice API to respond to the kids’s questions.

Waiting List - Price Drop Notification

Topic: Business
Date: 2019
Description:
- A Chrome extension that allows users to add an item to a waiting list, an item can be from any website - not just Amazon or other shopping websites - that already has a waiting list feature
- When the price of the item drops, the extension will notify the user.
- The user can set a price drop threshold.
- The idea came after a talk with my wife about her wish to buy a dress but the price was too high and she need to check the website regularly to see if the price has dropped.

Melbourne Airport - Available Parking Spot

Topic: Business
Date: 2024
Description:
- A camera system that can hang on light poles at the parking lot and monitor which parking spot is available.
- There is a screen or light - red or green - to indicate the availability of the parking spot.
- The idea came after I was frustrated to find a parking spot there - It took me more than 15 minutes to find a spot. The Melbourne Airport Value Parking is really big.
- I also found that many people complained about the same problem as me online.

Safety Checker - The simple and efficient way to deal with NSFW content

Topic: Generative Models
Date: 2024-12-18
Description:
- I wrote a blog post about the Safety Checker in Stable Diffusion here: link
- While Machine Unlearning is a interesting and fascinating research topic, I think from a business perspective, updating the Safety Checker is more practical and useful, when the model is already deployed and we need to deal with the new NSFW queries.
- More specifically, because the (current) Safety Checker is just a alignment model, where we have a pair of text and image encoder to measure the similarity between the key words - that we want to ban/filter - and the generated image. Therefore, it can be updated very easily when we have a new set of key words to ban/filter.
- We can also think about a new research direction starting from this setting: how to make the Safety Checker more efficient and effective.

Football Minimap Prediction

Topic: Business
Date: 2018
Description:
- This is very old idea back in 2018 when I was in Singapore and first time explored to GAN. Back then, I was so excited with Pix2Pix model and its ability to learn the transformation from domain A to domain B, and thought that it can be applied to this problem.

Motivation

Currently, the Football TV audiences do not know location of players who not in the current camera frame. Therefore they might has not the full experience as spectators are watching live on the stadium. We can have some simple solutions for this problem, such as using another camera to shoot he entire football field, or using statistical data from the chip attached to the players. However, from the perspective of computer vision engineer, I propose one more solution for the this problem (might be not a good choice but just for fun:), in which I use the GAN model to create a minimap from a camera frame as in the FIFA or PES football video game.

A sample of a frame from a game in Youtube where the minimap is there for better experience

In the case of players in the frame, I use the GAN model (Image 2 Image Translation) to learn the transformation from the player’s position in the frame to the position of the player in the minimap.

In the case of players are not in the current frame: I use [?] to predict a current positions of the players from the previous positions and their trajectory (or you might hope GAN as one size fit all model which can learn not only the current frame but also the invisible players)

Data Collection

Because the real matches on TV don’t have a minimap therefore I use the alternative sources there are FIFA18 and PES18 video on Youtube. Then I do some preprocess to collect and clean data.

Step 1: Cut only the frames which have the minimap within. Because in those videos, it’s not only normal frame (which has a minimap) but also spotlight or review or something else. Therefore I have to manually select the period of time in which there are only normal frames. I sample with the sample rate as 2 frames per second

A minimap from a full frame

Step 2: Cut minimap from a full frame and replace it by a random noise window. To avoid overfitting (because full frame also has a minimap) I replace a minimap by a random noise frame.

A frame with a minimap replaced by a random noise

Step 3: Remove bad samples (Those minimap have overlap by line or player within)

Bad examples: minimap has overlap by line or player within

After 3 steps as above, I have two sets: The camera frame set (with noised minimap) and the minimap set. I will chose camera frame set as a Source and minimap as a Target for Pix2Pix model. (You can swap 2 source and have enjoy the interesting result, in which we can render a camera frame from a minimap). Then I do a preprocessing to have a better dataset for training.

Because the football video game knows locations of not only players in camera frame but also all of players in the game. Therefore it can create a completed minimap that has the locations of all players. TV audiences who have only camera frame cannot do that. They only infer the position of players who are in camera frame, and cannot infer remaining players. Based on this intuition, I improve the model by doing crop the active window in minimap as follow:

Localize position of the ball (usually has yellow-color in minimap)
Crop roughly 1/2 width of minimap (1/4 in the left of ball and 1/4 in the right of ball)
Keep the height

I realize that audience change in each frame, and they might made a huge noise to model, which cause training more difficult. Moreover, players is really small in whole frame, and grass is not stable in each frame or each game, therefore, similar to audience, they might lead a huge noise to model. Therefore I design a filter to filter them from a Camera frame using Color Threshold App in Matlab.

Difficulties:

Pix2Pix model has demonstrated its ability to learn the transformation from domain A to domain B as shown in the paper: Day to night, BW to Color, Aerial to Map. However, in those cases, 2 domain are not too much different. In this case, we need a transformation from 2 completely different domains. It also needs a transformation from 2D - 2D matching in abstract level (model need to know each player is correspond to each circle in minimap). Therefore, it will be very challenge to learn
The difference between a Video Game Frame and a Real Camera Frame.
Dataset too small and noise.

I think this problem is difficult even humans, but it is worth to try and see what the GAN can do. Revised in 2025: I think the idea of generating minimap is still interesting and might be more realistic with the current stage of Generative Models.

A little bit more

What should I do?

Inspire me

Funny Memes

Ideas

Enjoy Reading This Article?