key points

  • another anchor-free, point-based object detection network
  • introduces a new loss, varifocal loss, a variant of focal loss that changes how positives and negatives are weighted to further compensate for the positive/negative imbalance
  • instead of predicting the classification score and the IoU score separately, this work predicts a single scalar that combines the two. The authors report that this gives better results in NMS post-processing.
  • star-shaped box feature representation
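
To make the second and third points concrete, here is a minimal per-sample sketch of focal loss next to varifocal loss, written in plain Python. It follows the formulas as I understand them from the two papers. The default values (alpha=0.25 and gamma=2 for focal loss, alpha=0.75 and gamma=2 for varifocal loss) are assumptions taken from commonly used settings, not something stated in this post. Here p is the predicted score and q is the target: the IoU with the ground-truth box for a positive sample, 0 for a negative one.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction p in (0, 1) and a hard label y in {0, 1}."""
    if y == 1:
        # Positives: easy examples (p close to 1) are down-weighted.
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    # Negatives: easy examples (p close to 0) are down-weighted.
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """Varifocal loss for one IoU-aware score p in (0, 1) and a soft target q in [0, 1]."""
    if q > 0:
        # Positives: plain binary cross-entropy against the IoU target,
        # weighted by q so that well-localized boxes contribute more.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negatives: only this side gets the focal-style down-weighting.
    return -alpha * p ** gamma * math.log(1.0 - p)


# A confident false positive costs far more than an easy negative.
easy_negative = varifocal_loss(0.1, 0.0)
hard_negative = varifocal_loss(0.9, 0.0)
```

The asymmetry is the point of the sketch: negatives are scaled down as in focal loss, while positives are not scaled down and are instead weighted by their IoU target.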


This work applies new ideas on top of FCOS + ATSS. I highly recommend reading up on both networks/methods. …

Dataset and DataLoader are the standard tools shipped with PyTorch for preparing and feeding data when training models. The official docs do a great job of showing how the two interact to provide an easier, cleaner way to feed data.

But even after following this great tutorial, I still wasn’t sure how exactly DataLoader gathers the data returned by Dataset into a batch.

The Dataset doesn’t restrict how the data should be returned: it can return a single object or multiple objects. So how does the DataLoader know how to bundle the returned object(s) into a batch?

This is…
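
As a rough illustration of the bundling question: DataLoader passes the list of samples for one batch to a collate function (the collate_fn argument), and the default one regroups the samples field by field. The sketch below is a simplified pure-Python imitation of that idea, not PyTorch's actual code. The name simple_collate is hypothetical, and plain lists stand in for tensors, so "stacking" here just means building a list.

```python
def simple_collate(samples):
    """Toy stand-in for a collate function: merge a list of samples into one batch.

    Assumptions: a tuple means "one sample made of several objects",
    a dict means "one sample with named fields", and anything else
    (including a plain list, which stands in for a tensor) is a single object.
    """
    first = samples[0]
    if isinstance(first, tuple):
        # Regroup by position, then collate each position on its own,
        # so images end up with images and labels with labels.
        return [simple_collate(list(field)) for field in zip(*samples)]
    if isinstance(first, dict):
        # Same idea for named fields.
        return {key: simple_collate([s[key] for s in samples]) for key in first}
    # A single object per sample: the batch is all of them together.
    return list(samples)


# Four samples, each an (image, label) pair as a Dataset's __getitem__ might return.
samples = [([0.1, 0.2], 0), ([0.3, 0.4], 1), ([0.5, 0.6], 0), ([0.7, 0.8], 1)]
images, labels = simple_collate(samples)
```

With a real DataLoader the same regrouping happens, except that each field comes back as a stacked tensor. A custom collate_fn can be passed when the samples cannot be stacked directly, for example when they have different sizes.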

This post is a summary of the key points I found important, with some of my own comments added. For more detail, and for parts I haven’t covered in this post, please refer to the paper.


key points

  • arbitrary style transfer in real time
  • uses adaptive instance normalization (AdaIN) layers, which align the mean and variance of the content features with those of the style features
  • allows control over the content-style trade-off, style interpolation, and color/spatial control
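
The AdaIN operation in the second point can be sketched in a few lines. This is a minimal plain-Python version for a single channel, where the content and style features are flat lists of activations. The function names and the eps value are my own choices for illustration. The stylize helper mirrors the content-style trade-off as I understand it from the paper: interpolate between the content features and the AdaIN output before decoding.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Shift and scale one content channel so its mean/std match the style channel."""
    c_mean, c_std = fmean(content), pstdev(content)
    s_mean, s_std = fmean(style), pstdev(style)
    # Normalize the content activations, then re-scale with the style statistics.
    # eps (an assumed small constant) avoids division by zero for flat channels.
    return [s_std * (x - c_mean) / (c_std + eps) + s_mean for x in content]


def stylize(content, style, alpha=1.0):
    """Content-style trade-off: alpha=0 keeps the content features, alpha=1 is full AdaIN."""
    target = adain(content, style)
    return [(1.0 - alpha) * c + alpha * t for c, t in zip(content, target)]


content_channel = [1.0, 2.0, 3.0, 4.0]
style_channel = [10.0, 20.0, 30.0, 40.0]
transferred = adain(content_channel, style_channel)
```

In the real network this runs per channel and per sample on encoder feature maps, and a decoder turns the result back into an image. There are no learned parameters in the AdaIN step itself.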

previous works

  • optimization approach: backpropagate through the network to minimize a style loss and a content loss. …


key points

  • finding an optimal design space instead of a single model architecture
  • through experimentation with these design spaces, the authors found a few practices that generally give better performance

What is “design space”?

A design space is defined by model-building parameters, each with its own range, and therefore defines the range of possible model structures.

Why chase design space rather than a singular design?

By chasing design spaces instead of individual networks, we can discover general design principles that hold across settings.

How to evaluate design space?

The quality of a design space can be measured by sampling network architectures from it and evaluating the sampled architectures. …
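
A toy sketch of the sample-and-evaluate idea above, in plain Python. Everything named here is hypothetical: DESIGN_SPACE and its parameters are invented for illustration, and toy_error is a made-up formula standing in for "train this model and measure its validation error". One way to summarize the sampled errors is an empirical distribution function: the fraction of sampled models whose error falls below a threshold.

```python
import random

# Hypothetical design space: each parameter has its own range of allowed values.
DESIGN_SPACE = {
    "depth": range(4, 25),
    "width": [16, 32, 64, 128, 256],
    "group_width": [1, 2, 4, 8, 16],
}


def sample_model(space, rng):
    """Draw one model configuration uniformly from the design space."""
    return {name: rng.choice(list(values)) for name, values in space.items()}


def toy_error(config):
    """Placeholder for real training; the formula is made up and carries no meaning."""
    return 100.0 / (config["depth"] + config["width"] ** 0.5)


def evaluate_design_space(space, evaluate, n_samples=100, seed=0):
    """Sample n_samples models and return the error of each one."""
    rng = random.Random(seed)
    return [evaluate(sample_model(space, rng)) for _ in range(n_samples)]


def error_edf(errors, threshold):
    """Fraction of sampled models whose error is below the threshold."""
    return sum(e < threshold for e in errors) / len(errors)


errors = evaluate_design_space(DESIGN_SPACE, toy_error, n_samples=50)
fraction_below_5 = error_edf(errors, 5.0)
```

Comparing two design spaces then means comparing their curves over a range of thresholds rather than comparing their single best models.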

arxiv link:

This paper introduces the ResNeXt architecture, which is built mainly upon ResNet. Naturally, it gives insight into how ResNeXt differs from, and improves on, ResNet.

key point

  • compared to ResNet, the residual blocks are upgraded to have multiple “paths”; the number of paths is what the paper calls “cardinality”, and it can be treated as another architecture design hyperparameter.
  • ResNeXt architectures with sufficient cardinality show improved performance
  • tl;dr: use the improved residual blocks instead of ResNet’s
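
One way to get a feel for cardinality is to count weights. The sketch below assumes the 1x1 -> 3x3 -> 1x1 bottleneck layout and treats the multiple paths as a grouped 3x3 convolution, which the paper describes as an equivalent form. The channel numbers (256 in/out, width 64 for ResNet, width 128 with 32 groups for ResNeXt) are my assumption of the paper's 32x4d template and are not stated in this post. Biases and batch norm are ignored.

```python
def bottleneck_params(in_ch, width, out_ch, groups=1):
    """Weight count of a 1x1 -> 3x3 -> 1x1 bottleneck; groups plays the role of cardinality."""
    reduce = in_ch * width                    # 1x1 conv that shrinks the channels
    conv3 = 3 * 3 * width * (width // groups) # 3x3 conv; each group only sees its own slice
    expand = width * out_ch                   # 1x1 conv that restores the channels
    return reduce + conv3 + expand


# ResNet-style block: one path, width 64.
resnet_block = bottleneck_params(256, 64, 256)

# ResNeXt-style block: 32 paths of width 4 each (total width 128).
resnext_block = bottleneck_params(256, 128, 256, groups=32)
```

Under these assumptions the two blocks come out at roughly the same size (about 70k weights each), so cardinality can be raised without making the block noticeably heavier.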

Different residual block

The key difference in the ResNeXt architecture is that it uses a different residual block structure from ResNet. This difference is well depicted in the following figure.


objective: make the “nautilus .” command work on a headless server.

install nautilus

$ sudo apt install nautilus

I tried installing a display manager first, but what worked in the end was starting the X server directly with xinit. related link

install xinit

$ sudo apt install xinit

create new xorg.conf

On a headless server, the default xorg confs (located at /usr/share/X11/xorg.conf.d) are not going to work because there is no real physical screen.

To be specific, I tried launching the X server with lightdm.

$ sudo systemctl start lightdm

However, this does not work. We can see why it doesn’t work inside the xorg log file located…

When writing a training script in TensorFlow, the need sometimes arises to add summary items (summary protobufs, to be exact) later on in the same step.

For example, let’s say a training session is running with a metric calculation step included. Periodically, I want to run a prediction on validation/test data and record the metrics for these predictions with the same summary writer used to log the training steps. In other words, a TensorBoard view like the following is desired:

a tensorboard screen capture where loss/metric tab items are logged for every training step and the test tab items are only logged occasionally.

In the above capture, loss and metric are recorded for every training step. On the…


I have been studying YOLOv2 for a while and first tried using it for car detection in real road situations. I used tiny-yolo as the base model with the pre-trained binary weights. While it recognized cars very well in traditional full-shot car images, like the ones you would see in a commercial, it did not work well on car images as a driver would see them from the driver’s seat.

Preparing Dataset

Get Images from Blackbox Video Footage

Clearly, the pretrained model was not trained on driver’s POV car images. In order to gather some data, I took the liberty of copying the blackbox videos…


Deep Learning Engineer LinkedIn:
