Home

Can we simply apply existing off-policy methods to leverage offline data when learning online, without offline RL pre-training or explicit imitation terms that privilege the prior offline data? Answering this question is the authors' primary objective. To do so, however, they had to solve three main problems. Expensive expert ...

In this post, I want to explore RECAP (RL with Experience and Corrections via Advantage-conditioned Policies), which combines advantage estimation with imitation learning, much like an actor-critic method in RL. In the RECAP algorithm, the advantage of each action is computed by a value network, and this information is fed into the VLM backbone as an improvement indic...

One of the biggest surprises that I encountered while majoring in applied mathematics was the statement that “the cardinal numbers of $\mathbb{N}$ (the set of natural numbers) and $\mathbb{Z}$ (the set of integers) are equal”. The cardinal number of a set is defined as the size of the set. For finite sets, if a set $A$ is empty then the cardinal...
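The claim that $\mathbb{N}$ and $\mathbb{Z}$ have the same cardinality comes down to exhibiting a one-to-one correspondence between them. Here is a minimal Python sketch of one such correspondence, assuming $\mathbb{N}$ starts at 0 and using the usual zigzag pairing. The function names `nat_to_int` and `int_to_nat` are illustrative and not taken from the post.

```python
def nat_to_int(n: int) -> int:
    """Map 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ... (zigzag pairing)."""
    if n < 0:
        raise ValueError("n must be a natural number (assumed to start at 0)")
    # Even naturals go to the non-negative integers, odd naturals to the negatives.
    return n // 2 if n % 2 == 0 else -(n + 1) // 2


def int_to_nat(z: int) -> int:
    """Inverse of nat_to_int: every integer is hit by exactly one natural."""
    return 2 * z if z >= 0 else -2 * z - 1


if __name__ == "__main__":
    print([nat_to_int(n) for n in range(7)])
```

Because each function undoes the other, no integer is missed and none is counted twice, which is what "equal cardinality" means for infinite sets.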

If you are not familiar with ROS 2 Topics, read this page first. Topics are used for data streams (unidirectional), and Services are used for client/server interactions (bidirectional). First, Services can work in a synchronous or asynchronous manner. If the service is synchronous, the client sends a Request and blocks until receiving a res...
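The difference between a synchronous and an asynchronous call can be sketched in plain Python. This is only an analogy built on the standard library's `concurrent.futures`, not the ROS 2 (`rclpy`) API: the handler `add_two_ints`, the helpers `call_sync` and `call_async`, and the single-worker executor standing in for the server are all assumptions made for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def add_two_ints(a: int, b: int) -> int:
    """Hypothetical service handler: turns a request (a, b) into a response."""
    return a + b


# A single worker thread plays the role of the server in this toy setup.
executor = ThreadPoolExecutor(max_workers=1)


def call_sync(a: int, b: int) -> int:
    # Synchronous client: send the request and block until the response arrives.
    return executor.submit(add_two_ints, a, b).result()


def call_async(a: int, b: int, on_response) -> Future:
    # Asynchronous client: send the request, register a callback, return at once.
    future = executor.submit(add_two_ints, a, b)
    future.add_done_callback(lambda f: on_response(f.result()))
    return future


if __name__ == "__main__":
    print(call_sync(2, 3))
    call_async(4, 5, lambda response: print("async response:", response))
    executor.shutdown(wait=True)  # let the pending callback finish before exiting
```

In the synchronous case the caller can do nothing else while it waits. In the asynchronous case the caller keeps running and the response is handled later by the callback.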

A Topic is a named channel that receives messages from publishers (nodes). A publisher sends data to the topic without knowing which subscribers (nodes) receive it. Similarly, subscribers do not know which nodes send the data to the topic. On top of that, a node is not restricted to sending data to a single topic but send...
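The decoupling between publishers and subscribers can be illustrated with a small in-memory sketch. This is a toy stand-in written in plain Python, not the ROS 2 (`rclpy`) API: the class `ToyTopicBus`, its methods, and the topic names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable


class ToyTopicBus:
    """In-memory stand-in for topic routing; NOT the ROS 2 API."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> None:
        # The publisher only names the topic; it never sees who is listening.
        for callback in list(self._subscribers.get(topic, [])):
            callback(message)


if __name__ == "__main__":
    bus = ToyTopicBus()
    bus.subscribe("/temperature", lambda m: print("logger got", m))
    bus.subscribe("/temperature", lambda m: print("display got", m))
    bus.subscribe("/status", lambda m: print("monitor got", m))

    # One publishing node can send to more than one topic.
    bus.publish("/temperature", 21.5)
    bus.publish("/status", "ok")
```

Neither side holds a reference to the other: subscribers register against a topic name, and the publisher addresses the same name.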

Nodes are subprograms in an application, each responsible for only one thing. Nodes communicate with each other through topics, services, and parameters. Like OOP, nodes reduce code complexity and improve fault tolerance. Furthermore, nodes can be written in different programming languages, including Python and C++. Nodes should have a sing...

Run nodes: ros2 run <package name> <node name>. NOTE: the “-h” option shows arguments and options, for example: ros2 -h, ros2 run -h, ros2 node -h. Check running nodes: ros2 node list. Check node info: ros2 node info <node name>. WARNING: Running two nodes with identical names is discouraged. These could run at the same t...

Monte Carlo Tree Search (MCTS) works well in practice but poses theoretical challenges. In this post, I want to describe the MCTS algorithm and explain why it works. Open-loop planning algorithms like MCTS can plan future actions from an initial state $s_0$. They assume access to a model of the environment, either stochastically or ...


Jaehyun Jeong

RLPD: Reinforcement Learning with Prior Data

Decoding RECAP: A Theoretical Look at $π^{*}_{0.6}$'s Reinforcement Learning Approach

Cantor’s Diagonal Argument: Not All Infinities Are Equal

Basic Guide to build and run ROS 2 Services (Python & C++)

Basic Guide to build and run ROS 2 Topics (Python & C++)

Basic Guide to build and run ROS 2 Nodes (Python & C++)

Basic Commands for ROS 2

Monte Carlo Tree Search