Calculation and Design Calculate the DOF of the mechanism and...
…
5.9K views
1 year ago
askfilo.com
3:10
The secret to fine-tuning large models without "amnesia": why does RL hold on to old knowledge better than SFT? A new finding from MIT
…
990 views
1 month ago
bilibili
卢菁博士_北大AI博士后
19:23
A step-by-step guide to quickly understanding SFT, RLHF, and DPO! The full process, from definitions to the limits of applicability
…
1.5K views
1 month ago
bilibili
爱学大模型的柒柒
14:19
A survey of large model alignment methods with code examples (Part 2)
444 views
6 months ago
bilibili
swanmsg
14:19
Robotic 08_ Robot Simulation using matlab (DH parameter using Peter
…
113.1K views
Apr 21, 2017
YouTube
Dr. Amr Zamel
37:07
Circuits I: RLC Circuit Response
300.6K views
Jun 5, 2015
YouTube
The PhD Engineer
3:04
Robotic Arm Control and Task Training through Deep Reinforce
…
4.7K views
Apr 8, 2021
YouTube
IAS-Lab
14:22
[CFD] The SIMPLE Algorithm (to solve incompressible Navier-Stokes)
153.5K views
Sep 25, 2018
YouTube
Fluid Mechanics 101
28:58
Q Learning Algorithm and Agent - Reinforcement Learning p.2
113.4K views
May 31, 2019
YouTube
sentdex
38:01
CS 285: Lecture 15, Part 1: Offline Reinforcement Learning
16.3K views
Oct 16, 2021
YouTube
RAIL
1:10:05
Reproducing RLHF training from scratch, a TRL-based reproduction, hands-on code, large language model training
8.8K views
Nov 18, 2024
bilibili
蓝斯诺特
4:20
A reinforcement learning algorithm engineer's year in review: rollout, asynchrony, and framework design in RL training
3.4K views
2 months ago
bilibili
yang_xi_111
0:52
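Taught by an algorithms expert from Harbin Institute of Technology! "Large Model Algorithms: Reinforcement Learning, Fine-Tuning, and Alignment" 100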
…
139 views
9 months ago
bilibili
博文视点阿豹Class
0:56
New work from a Google expert: RL from the basics to the frontier
264 views
4 months ago
bilibili
AI梨大谱
16:24
[Agentic RL] 10: Understanding SFT training and RL training of LLMs from a distributional perspective, Forward
…
5.6K views
1 month ago
bilibili
五道口纳什
1:37:40
How can RL teach LLMs to use tools both well and accurately?
3.1K views
10 months ago
bilibili
NICE学术
7:05
First-order circuits (RL), the three-element method; how not to fail the final exam, a beginner asking for advice
16.1K views
Jun 19, 2023
bilibili
桐桐桐童心呀
1:14:20
[Online RL] 17: The OLIVE algorithm (Optimism Let Iterative Value-fun
…
462 views
3 months ago
bilibili
JOJO想
1:01
An efficient region-aware 6-DoF grasping algorithm based on a normalized grasp space
265 views
Oct 23, 2024
bilibili
ChenThree3
18:45
What is reinforcement learning (RL) actually doing? RL Principles Explained series #1
7.1K views
Oct 31, 2023
bilibili
Up-Fei
35:41
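[Plain Talk 03] The basic principles of reinforcement learning (RL) in a single article | Illustrated principles + formula derivations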
103.5K views
11 months ago
bilibili
吃花椒的麦
30:43
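Chapter 2: Transient response of first-order circuits - finding initial values with the switching rule (RC, RL, RLC circuits - example problems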
…
12.5K views
Sep 29, 2021
bilibili
橙子3712
1:00:50
Reinforcement learning, lesson 1 (basic RL concepts, tools, basic algorithms) [Personal knowledge sharing]
27.7K views
Dec 2, 2021
bilibili
二营长向强化学习开炮
0:38
A major breakthrough in RL algorithms! Multi-agent collaboration performance soars
217 views
10 months ago
bilibili
AI因斯坦玩转AI
16:01
[RLHF] From PPO RLHF to DPO: formula derivations and analysis of the principles
22.2K views
Jun 23, 2024
bilibili
五道口纳什
16:42
Compiler Principles, chapter 4: constructing the LR(0) DFA and determining whether an SLR(1) parsing table can resolve
…
616 views
2 months ago
bilibili
甜滋滋的巧克力豆
23:15
Why are RL-trained models less prone to forgetting than SFT-trained ones? RL's Occam's razor principle:
…
6.2K views
5 months ago
bilibili
AI论文小小编
6:23
88. RL topics: how to implement random exploration in a policy
1.7K views
10 months ago
bilibili
文言AI
1:08
Rated 9.4 on Douban! "Large Model Algorithms": reinforcement learning, DPO, SFT fine-tuning, GRPO, PPO, RL
…
10.2K views
9 months ago
bilibili
叶子哥AI
7:21
106. RL topics: an introduction to the DPO execution flow
2K views
9 months ago
bilibili
文言AI