A Learning Log

Too much personal stuff, don't read it!

Learning targets

This is a learning log to keep track of what I have learned each day. To avoid distraction and fluctuation, I will set some long-term learning targets that need disciplined and consistent effort to achieve.

2023-10-03

(#Coding) How to change a GitHub Pages theme to a new one?

I was quite happy with the AcademicPages theme, which is more than enough to show publications or simple blog posts. However, when I wanted to blog more seriously, the AcademicPages theme showed some limitations (or maybe I didn’t dig deep enough). For example, it does not support image captions, the table of contents is not automatically generated, code blocks are not highlighted properly, etc. Therefore, I tried to change to a new theme, and I fell in love with the al-folio theme, which has all the features I need. The only problem was that I didn’t know how to change the theme without losing all the previous posts. The installation instructions are quite simple but only cover a new blog created from scratch. After some googling and trying, I found a solution that works for me. Here are the steps:

(#Coding) How to create a draft post in Jekyll? Source

(#Coding) 75 Leetcode Questions

I am starting a challenge to complete 75 Leetcode questions in fewer than 75 days. I will log my progress in this blog post.

2023-09-28

(#Finance) The First Home Super Saver Scheme is the best way to save for a deposit by Kuan Tian

(#Finance) FIRE for Aussies by Kuan Tian

(#Finance) How to reduce tax via property investment by Kuan Tian again. The third video in just one morning; I am falling down a rabbit hole because his videos are really good. (Who doesn’t love :money_with_wings:, especially a procrastinating, almost-graduated PhD student :joy:)

2023-09-27

(#Research) A new perspective on the motivation of VAE (by Dinh)

Mathematically, we want to minimize the KL divergence between \(q_{\theta} (z \mid x)\) and \(p(z \mid x)\):

\[\mathcal{D}_{KL} (q_{\theta} (z \mid x) \parallel p(z \mid x) ) = \mathbb{E}_{q_{\theta} (z \mid x)} \left[ \log \frac{q_{\theta} (z \mid x)}{p(z \mid x)} \right] = \mathbb{E}_{q_{\theta} (z \mid x)} \left[ \log q_{\theta} (z \mid x) - \log p(z \mid x) \right]\]

Applying Bayes’ rule, we have:

\[\mathcal{D}_{KL} (q_{\theta} (z \mid x) \parallel p(z \mid x) ) = \mathbb{E}_{q_{\theta} (z \mid x)} \left[ \log q_{\theta} (z \mid x) - \log p(x \mid z) - \log p(z) + \log p(x) \right]\]

\[\mathcal{D}_{KL} (q_{\theta} (z \mid x) \parallel p(z \mid x) ) = \mathbb{E}_{q_{\theta} (z \mid x)} \left[ \log q_{\theta} (z \mid x) - \log p(x \mid z) - \log p(z) \right] + \log p(x)\]

\[\mathcal{D}_{KL} (q_{\theta} (z \mid x) \parallel p(z \mid x) ) = - \mathbb{E}_{q_{\theta} (z \mid x)} \left[ \log p(x \mid z) \right] + \mathcal{D}_{KL} \left[ q_{\theta} (z \mid x) \parallel p(z) \right] + \log p(x)\]

So, minimizing \(\mathcal{D}_{KL} (q_{\theta} (z \mid x) \parallel p(z \mid x) )\) is equivalent to maximizing the ELBO: \(\mathbb{E}_{q_{\theta} (z \mid x)} \left[ \log p(x \mid z) \right] - \mathcal{D}_{KL} \left[ q_{\theta} (z \mid x) \parallel p(z) \right]\).
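As a sanity check, the two ELBO terms can be computed numerically. Below is a minimal NumPy sketch (my own illustration, not from the derivation above), assuming a diagonal Gaussian encoder \(q_{\theta}(z \mid x)\) and a standard normal prior \(p(z)\), so the KL term has a closed form; the reconstruction term is approximated with toy Monte Carlo values of \(\log p(x \mid z)\).

```python
import numpy as np

def kl_gaussian_standard(mu, log_var):
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def elbo_estimate(mu, log_var, log_px_given_z):
    # ELBO = E_q[ log p(x|z) ] - KL( q(z|x) || p(z) );
    # the first expectation is approximated with samples z ~ q(z|x)
    return np.mean(log_px_given_z) - kl_gaussian_standard(mu, log_var)

# toy numbers: a 2-d latent and 3 Monte Carlo estimates of log p(x|z)
mu, log_var = np.array([0.5, -0.2]), np.zeros(2)
print(elbo_estimate(mu, log_var, np.array([-1.2, -0.9, -1.1])))
```

Note that when \(q_{\theta}(z \mid x)\) equals the prior, the KL term vanishes and only the reconstruction term remains.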

Another perspective on the motivation of VAE can be seen from the development of the Auto Encoder (AE) model.

2023-09-23

(#Finance) Reading My solopreneur story: zero to $$45K/mo in 2 years

This is a very inspiring story about how Tony Dinh built his side project into a profitable software product and became a solopreneur. Some key takeaways:

(#F4T) How to get rich (without getting lucky) by Naval Ravikant. Some points I like, given my current knowledge and experience:

(#F4T) Less is More principle

2023-09-22

(#Research) On Reading: FLOW MATCHING FOR GENERATIVE MODELING (ICLR 2023)

2023-09-17

(#Research) Parameter-Efficient Fine-Tuning. Link to Youtube video

(#Research) Tree of Thoughts: Deliberate Problem Solving with Large Language Models. Link to Yannic’s video

2023-09-14

(#Research) On Reading: Diffusion Models Beat GANs on Image Synthesis. Link to blog post

2023-09-08

(#Idea) Erasing a concept from a generative model given a set of images.

2023-09-04

(#Research) Chasing General Intelligence by Dr. Rui Shu (OpenAI) - Guest lecture at Monash FIT 3181 - Deep Learning Unit.

Disclaimer: Rui gave a great talk and I want to take notes on it. However, since these notes are detached from their context, they do not necessarily reflect Rui’s opinions.

Part 1: Rui’s Research Journey:

Part 2: What is going wrong?

Style-content disentanglement (Image from Rui's talk)

Part 3:

Yang Song’s paper on estimating gradient of data distribution to learn generative model

Thought on Diffusion Model (Image from Rui's talk)

Rui’s thought after his lesson on Diffusion Model:

Where are we now?

GPT-4 Drawing (Image from Rui's talk)
The last step before AGI (Image from Rui's talk)

How to get there?

The inference-heavy future?

Final Advice

(#Research) Connection between Latent Diffusion formulation and the Rate-Distortion theory (Trung’s idea). Below are some personal notes for memorization without leaking the idea.

2023-09-01

(#Research) On reading: TRADING INFORMATION BETWEEN LATENTS IN HIERARCHICAL VARIATIONAL AUTOENCODERS, published at ICLR 2023.

Revisit Rate-Distortion trade-off theory:

Rate distortion theory?

\[H - D \leq I(z,x) \leq R\]

where \(H\) is the entropy of data \(x\) and \(D\) is the distortion of the reconstruction \(x\) from \(z\). \(R\) is the rate of the latent code \(z\) (e.g., compression rate).

\(R = \mathbb{E}_{e(z \mid x)} \left[ \log \frac{e(z \mid x)}{m(z)} \right]\) where \(e(z \mid x)\) is the encoder and \(m(z)\) is the prior distribution of \(z\). The higher the rate, the more information about \(x\) is preserved in \(z\). However, a high rate lessens the generalization ability of \(\log p(x \mid z, \theta)\).

The mutual information is upper bounded by the rate of the latent code \(z\). For example, if \(R=0\) then \(I(z,x)=0\). This is because \(e(z \mid x) = m(z)\), which means the encoder does not capture any information from the data \(x\).
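To make the \(R = 0\) case concrete, here is a small NumPy sketch (my own illustration, not from the paper) that computes the rate as the closed-form KL between a diagonal Gaussian encoder \(e(z \mid x)\) and a standard normal prior \(m(z)\): when the encoder matches the prior, the rate, and hence the bound on the mutual information, collapses to zero.

```python
import numpy as np

def rate(mu, log_var):
    # R = E_{e(z|x)}[ log e(z|x) - log m(z) ] = KL( e(z|x) || m(z) ),
    # in closed form for a diagonal Gaussian encoder and N(0, I) prior m(z)
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Encoder equal to the prior: zero rate, so I(z, x) <= R = 0
print(rate(np.zeros(4), np.zeros(4)))
# Encoder that actually depends on x: positive rate
print(rate(np.array([1.0, -1.0]), np.array([-2.0, -2.0])))
```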

Motivation of the paper:

Standard hierarchical VAEs:

Standard hierarchical VAEs

Generalized Hierarchical VAEs:

Generalized Hierarchical VAEs

2023-08-30

(#Research) NVIDIA event: Transforming Your Development and Business with Large Language Models

Introduction, Demystifying LLM and Data Curation

LLM Training and Inference at Scale. Customized LLM with Prompt-Learning

Questions:

Nvidia Framework:

2023-08-26

(#F4T) Review 10 best ideas/concepts from Charlie Munger. Link to the blog post: https://tuananhbui89.github.io/blog/2023/f4t/

2023-08-25

(#Research) Data-Free Knowledge Distillation

\[L_G = L_{CE} (T(\hat{x}), y) - L_{KL} (T(\hat{x}), S(\hat{x}))\]

where \(T(\hat{x})\) is the teacher model, \(S(\hat{x})\) is the student model, \(\hat{x}\) is the synthetic data generated by the generator \(G\), and \(y\) is the label of the synthetic data. Minimizing the first term encourages the generator to generate data that falls into the target class \(y\), while maximizing the second term presumably encourages it to generate diverse data. Compared to a GAN, we can think of both the teacher and student models as acting as discriminators.

This adversarial game needs to be integrated into the training process at each iteration. For example, after each iteration, you minimize \(L_G\) to generate new synthetic data, and then use \(\hat{x}\) to train the student. This ensures the synthetic data is new to the student model. Therefore, one of the drawbacks of DFKD is that it is very slow.
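A rough NumPy sketch of the generator objective above (toy logits and my own illustration; a real DFKD implementation would backpropagate this loss through the generator):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def generator_loss(teacher_logits, student_logits, y):
    # L_G = CE( T(x_hat), y ) - KL( T(x_hat) || S(x_hat) )
    t, s = softmax(teacher_logits), softmax(student_logits)
    ce = -np.log(t[np.arange(len(y)), y] + 1e-12).mean()
    kl = np.sum(t * (np.log(t + 1e-12) - np.log(s + 1e-12)), axis=-1).mean()
    return ce - kl

# hypothetical logits for a batch of 2 synthetic samples and 3 classes
teacher = np.array([[2.0, 0.1, -1.0], [0.0, 1.5, 0.2]])
student = np.array([[1.0, 0.5, -0.5], [0.3, 0.4, 0.1]])
print(generator_loss(teacher, student, y=np.array([0, 1])))
```

When the student perfectly matches the teacher, the KL term is zero and the generator is driven purely by the class-targeting cross-entropy, which is exactly when new samples stop being informative.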

2023-08-21

(#Research) On reading: Decoupled Kullback-Leibler Divergence Loss.

Decoupled Kullback-Leibler (DKL) (reference).

2023-08-20

(#Research) On reading: Classifier-Free Diffusion Guidance.

(#Idea) Mixup Class-Guidance Diffusion model.

Confliction of gradients (reference).

2023-08-19

(#Coding) How to show an image in Github page. Reference to this post: https://tuananhbui89.github.io/blog/2023/learn-code/

2023-08-18

(#Research) On Reading: DDGR: Continual Learning with Deep Diffusion-based Generative Replay

(#Idea) We can use Class-Guidance Diffusion model to learn mixup data and then can use that model to generate not only data from pure classes but also from mixup classes. It is well accepted that mixup technique can improve the generalization of classifier, so it can be applied to CL setting as well.
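A quick NumPy sketch of the mixup operation the idea relies on (toy example of my own; the resulting soft label is what a class-guidance diffusion model would condition on):

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=0.4):
    # Blend two samples and their one-hot labels with a Beta-sampled weight;
    # the soft label below is the "mixup class" the model would condition on
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x_mix, y_mix = mixup(np.ones(4), np.array([1.0, 0.0]),
                     np.zeros(4), np.array([0.0, 1.0]))
print(x_mix, y_mix)
```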

2023-08-17

(#Code) How to disable NSFW detection in Huggingface.

```python
# line 426 in the pipeline_stable_diffusion.py
def run_safety_checker(self, image, device, dtype):
    # Early return: skip the safety check and report no NSFW content
    return image, None

    # The following original code is now unreachable and will be ignored
    if self.safety_checker is None:
        has_nsfw_concept = None
    else:
        if torch.is_tensor(image):
            feature_extractor_input = self.image_processor.postprocess(image, output_type="pil")
        else:
            feature_extractor_input = self.image_processor.numpy_to_pil(image)
        safety_checker_input = self.feature_extractor(feature_extractor_input, return_tensors="pt").to(device)
        image, has_nsfw_concept = self.safety_checker(
            images=image, clip_input=safety_checker_input.pixel_values.to(dtype)
        )
    return image, has_nsfw_concept
```
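Alternatively, the same effect can be obtained without editing the library source by monkey-patching the method on the pipeline instance at runtime. The sketch below uses a stand-in class so it runs without diffusers installed; with the real library, `pipe` would be your loaded `StableDiffusionPipeline`.

```python
# `FakePipeline` is a stand-in so this sketch runs without diffusers;
# with the real library, `pipe` would be a loaded StableDiffusionPipeline
class FakePipeline:
    def run_safety_checker(self, image, device, dtype):
        raise RuntimeError("safety checker would run here")

def no_safety_checker(self, image, device, dtype):
    # same signature as the patched method: image untouched, no NSFW flags
    return image, None

pipe = FakePipeline()
# bind the replacement to this instance only, leaving the class untouched
pipe.run_safety_checker = no_safety_checker.__get__(pipe)
print(pipe.run_safety_checker("img", "cuda", None))  # -> ('img', None)
```

This avoids keeping a locally modified copy of `pipeline_stable_diffusion.py` that would be overwritten on the next library upgrade.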

(#Idea, #GenAI, #TML) Completely erase a concept (i.e., NSFW) from latent space of Stable Diffusion.

2023-08-16

(#Research) The Inappropriate Image Prompts (I2P) benchmark.

2023-08-14

(#Research) Some trends in KDD 2023: Graph Neural Networks and Causal Inference from Industrial Applications.

(#Research) Graph Neural Networks and the definition of neighborhood aggregation. Most GNN methods work on millions of nodes; to scale to billions of nodes, there are a lot of tricks under the hood (from Dinh’s working experience at Trustingsocial).
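A minimal NumPy sketch of one neighborhood-aggregation (message-passing) step, using mean aggregation on a toy 3-node path graph (my own illustration; the identity weight matrix is a hypothetical choice):

```python
import numpy as np

def mean_aggregate(A, H, W):
    # One message-passing step: each node averages its neighbours' features
    # (adjacency A, node features H), applies a linear map W, then a ReLU
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # isolated nodes just keep zero messages
    return np.maximum((A @ H / deg) @ W, 0.0)

# toy 3-node path graph 0-1-2 with 2-d features and identity weights
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = np.array([[1., 0.], [0., 1.], [1., 1.]])
out = mean_aggregate(A, H, np.eye(2))
print(out)
```

Scaling this beyond millions of nodes is where the engineering tricks come in (neighbor sampling, mini-batching, distributed storage of `A` and `H`).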

(#Research) (With Trung and Van Anh) We derived a nice framework that connects data-space distributional robustness (as in our ICLR 2022 paper) and model-space distributional robustness (as in SAM).

2023-08-08

(#Research) On reading: Erasing Concepts from Diffusion Models (ICCV 2023). https://erasing.baulab.info/

(#Research) On reading: CIRCUMVENTING CONCEPT ERASURE METHODS FOR TEXT-TO-IMAGE GENERATIVE MODELS. Project page: https://nyu-dice-lab.github.io/CCE/

2023-08-06

(#Coding) Strange bug in generating adversarial examples using Huggingface.

2023-08-05

(#Coding) Understand the implementation of the Anti-Dreambooth project. Ref to the blog post

2023-08-04

(#Research) Three views of Diffusion Models:

2023-08-03

(#Research) Trusted Autonomous Systems

2023-08-01

(#Research) Helmholtz Visiting Researcher Grant

2023-07-31

(#Research) Australia Research Council (ARC) Discovery Project (DP) 2023.

2023-07-30

(#Finance) First Home Buyer Super Saver Scheme.

2023-07-27

(#GenAI) How to run textual inversion using the Huggingface library locally without logging in to Huggingface with a token. Including:

2023-07-24

(#Productivity) How to present slides and take notes on the same screen simultaneously (e.g., very useful when teaching or giving a talk). At Monash, the lecture theatres have MirrorOp installed on all screens, which can connect wirelessly with a laptop, but it is not convenient when we want to take notes.

2023-07-23

Micromouse competition.

Reference:

2023-07-22

(#Parenting) ATAR and university admission.

(#AML) Rethinking Backdoor Attacks, Madry’s group. https://arxiv.org/pdf/2307.10163.pdf

(#Productivity) How to synchronize a PowerPoint file in Teams among multiple editors working simultaneously (i.e., share the file in Teams and open it locally in PowerPoint); this way the math equations are retained.

If you have math equations in your PowerPoint and open it in Google Slides, the equations will be converted to images. If you accidentally sync that file back over your original, all the math equations will be lost.

2023-07-21

(#F4T) You cannot solve a problem with the same thinking that created it. Albert Einstein.

Context: Our lab had a workshop last week and Dinh gave a talk about his favorite book, “The 7 Habits of Highly Effective People”. One of the habits, “Sharpen the saw”, means that you always need to improve yourself in all aspects: physical, mental, spiritual, and social. That is how you can overcome your limits and the obstacles you are facing.

(#Experience) The first department meeting as a new Research Fellow.

Context: I have just started my new position as a Research Fellow at the Department of Data Science and AI, Monash University. Today was my first exposure to what really happens beyond the student’s perspective.