Reconstructive Visual Instruction Tuning - OpenReview This paper introduces reconstructive visual instruction tuning (ROSS), a family of Large Multimodal Models (LMMs) that exploit vision-centric supervision signals. In contrast to conventional visual
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning The authors motivate their approach with a clever memorization task. They show that vanilla LoRA with low rank (e.g., rank 8) does poorly on a contrived memorization task. They train on 10k unique IDs for 100 epochs, where the LLM has to memorize the value associated with a key. This is a nice way to motivate the rest of the results. The authors do rigorous benchmarking across training approaches.
MIND over Body: Adaptive Thinking using Dynamic Computation Clever use of intermediate activations to assess input complexity. Should be able to work with existing architectures, making it simpler to engineer for downstream real-world use cases.
Self-Taught Evaluators | OpenReview The method utilises clever prompting to generate pairs of completions for an instruction, where one is known to be preferred to the other. Judgements with chain-of-thought reasoning are then generated from the target LLM, and the correct ones are used for SFT in an iterative procedure.
xLSTM: Extended Long Short-Term Memory | OpenReview This paper proposes the extended long short-term memory (xLSTM) model, which adds enhanced features such as exponential gating and modified memory structures to the traditional LSTM framework. These enhanced features can effectively mitigate some of the well-known limitations of LSTM (e.g., poor at revising storage decisions, limited storage capacity and lack of
Instruct2Act: Mapping Multi-modality Instructions to Robotic Arm... Foundation models have significantly advanced in various applications, including text-to-image generation, open-vocabulary segmentation, and natural language processing. This paper presents Instruct2Act, a framework that leverages Large Language Models (LLMs) to convert multi-modal instructions to sequential actions for robotic manipulation tasks. Specifically, Instruct2Act uses LLMs to
Inner Information Analysis Algorithm for Deep Neural Network based... Deep learning has achieved advancements across a variety of forefront fields. However, its inherent 'black box' characteristic poses challenges to the comprehension and trustworthiness of the decision-making processes within neural networks. To mitigate these challenges, we introduce InnerSightNet, an inner information analysis algorithm designed to illuminate the inner workings of deep neural
Cross-Embodiment Dexterous Grasping with Reinforcement Learning Dexterous hands exhibit significant potential for complex real-world grasping tasks. While recent studies have primarily focused on learning policies for specific robotic hands, the development of a universal policy that controls diverse dexterous hands remains largely unexplored. In this work, we study the learning of cross-embodiment dexterous grasping policies using reinforcement learning.
Counterfactual Debiasing for Fact Verification In this paper, we have proposed a novel counterfactual framework, CLEVER, for debiasing fact-checking models. Unlike existing works, CLEVER is augmentation-free and mitigates biases at the inference stage. In CLEVER, the claim-evidence fusion model and the claim-only model are independently trained to capture the corresponding information.