[{"data":1,"prerenderedAt":181},["ShallowReactive",2],{"site-nav":3,"home-index":39,"papers-list":54,"home-team":163},{"id":4,"extension":5,"items":6,"meta":25,"stem":37,"__hash__":38},"nav\u002Fnav.md","md",[7,10,19,22],{"label":8,"to":9},"Papers","\u002Fpapers",{"label":11,"items":12},"Benchmark",[13,16],{"label":14,"to":15},"Retrieval","\u002Fretrieval",{"label":17,"to":18},"Candidate Assessment","\u002Fcandidate-assessment",{"label":20,"to":21},"Team","\u002Fteam",{"label":23,"to":24},"Careers","https:\u002F\u002Fcareers.malt.com\u002F",{"body":26},{"type":27,"value":28,"toc":33},"minimark",[29],[30,31,32],"p",{},"Navigation links for the site header. This file is loaded by the default layout and is not meant as a standalone page.",{"title":34,"searchDepth":35,"depth":35,"links":36},"",2,[],"nav","4zzqubtdeLCKskuzs3cOPP7CYzCZzJ_6gUl4De3Xamc",{"id":40,"title":41,"body":42,"description":46,"extension":5,"image":47,"meta":48,"navigation":49,"path":50,"seo":51,"stem":52,"__hash__":53},"content\u002Findex.md","Research That Powers the Future of Freelancing",{"type":27,"value":43,"toc":44},[],{"title":34,"searchDepth":35,"depth":35,"links":45},[],"Recent papers and preprints from our group: open work, reproducible methods, and clear writeups.","\u002Fimg\u002Fhero.png",{},true,"\u002F",{"title":41,"description":46},"index","p9VlwKjqZd_A1val2wBhGk1oTQtqKTArwtuhftwfT9A",[55,101,119,145],{"id":56,"title":57,"authors":58,"body":78,"date":93,"description":94,"extension":5,"image":34,"link":95,"meta":96,"navigation":49,"path":97,"seo":98,"stem":99,"summary":34,"__hash__":100},"papers\u002Fpapers\u002Fworkrb-a-community-driven-evaluation-framework-for-ai-in-the-work-domain.md","WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain",[59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77],"Matthias De Lange","Warre Veys","Federico Retyk","Daniel Deniz","Warren Jouanneau","Mike Zhang","Aleksander Bielinski","Emma Jouffroy","Nicole Clobes","Nina 
Baranowska","David Graus","Marc Palyart","Rabih Zbib","Dimitra Gkatzia","Thomas Demeester","Tijl De Bie","Toine Bogers","Jens-Joris Decorte","Jeroen Van Hautte",{"type":27,"value":79,"toc":91},[80],[30,81,82,83,90],{},"Today's evolving labor markets rely increasingly on recommender systems for hiring, talent management, and workforce analytics, with natural language processing (NLP) capabilities at the core. Yet, research in this area remains highly fragmented. Studies employ divergent ontologies (ESCO, O*NET, national taxonomies), heterogeneous task formulations, and diverse model families, making cross-study comparison and reproducibility exceedingly difficult. General-purpose benchmarks lack coverage of work-specific tasks, and the inherent sensitivity of employment data further limits open evaluation. We present WorkRB (Work Research Benchmark), the first open-source, community-driven benchmark tailored to work-domain AI. WorkRB organizes 13 diverse tasks from 7 task groups as unified recommendation and NLP tasks, including job\u002Fskill recommendation, candidate recommendation, similar item recommendation, and skill extraction and normalization. WorkRB enables both monolingual and cross-lingual evaluation settings through dynamic loading of multilingual ontologies. Developed within a multi-stakeholder ecosystem of academia, industry, and public institutions, WorkRB has a modular design for seamless contributions and enables integration of proprietary tasks without disclosing sensitive data. WorkRB is available under the Apache 2.0 license at ",[84,85,89],"a",{"href":86,"rel":87},"https:\u002F\u002Fgithub.com\u002Ftechwolf-ai\u002FWorkRB",[88],"nofollow","this https URL",".",{"title":34,"searchDepth":35,"depth":35,"links":92},[],"2026-03-17","Today's evolving labor markets rely increasingly on recommender systems for hiring, talent management, and workforce analytics, with natural language processing (NLP) capabilities at the core. 
Yet, research in this area remains highly fragmented. Studies employ divergent ontologies (ESCO, O*NET, national taxonomies), heterogeneous task formulations, and diverse model families, making cross-study comparison and reproducibility exceedingly difficult. General-purpose benchmarks lack coverage of work-specific tasks, and the inherent sensitivity of employment data further limits open evaluation. We present WorkRB (Work Research Benchmark), the first open-source, community-driven benchmark tailored to work-domain AI. WorkRB organizes 13 diverse tasks from 7 task groups as unified recommendation and NLP tasks, including job\u002Fskill recommendation, candidate recommendation, similar item recommendation, and skill extraction and normalization. WorkRB enables both monolingual and cross-lingual evaluation settings through dynamic loading of multilingual ontologies. Developed within a multi-stakeholder ecosystem of academia, industry, and public institutions, WorkRB has a modular design for seamless contributions and enables integration of proprietary tasks without disclosing sensitive data. 
WorkRB is available under the Apache 2.0 license at https:\u002F\u002Fgithub.com\u002Ftechwolf-ai\u002FWorkRB.","https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.13055",{},"\u002Fpapers\u002Fworkrb-a-community-driven-evaluation-framework-for-ai-in-the-work-domain",{"title":57,"description":94},"papers\u002Fworkrb-a-community-driven-evaluation-framework-for-ai-in-the-work-domain","6lW2n_-yMXk7Mfeg89TXUNMhKcgGEVJeZI-qp7py2QU",{"id":102,"title":103,"authors":104,"body":105,"date":112,"description":109,"extension":5,"image":34,"link":113,"meta":114,"navigation":49,"path":115,"seo":116,"stem":117,"summary":34,"__hash__":118},"papers\u002Fpapers\u002Fan-efficient-long-context-ranking-architecture-with-calibrated-llm-distillation-application-to-person-job-fit.md","An Efficient Long-Context Ranking Architecture With Calibrated LLM Distillation: Application to Person-Job Fit",[63,66,70],{"type":27,"value":106,"toc":110},[107],[30,108,109],{},"Finding the most relevant person for a job proposal in real time is challenging, especially when resumes are long, structured, and multilingual. In this paper, we propose a re-ranking model based on a new generation of late cross-attention architecture, that decomposes both resumes and project briefs to efficiently handle long-context inputs with minimal computational overhead. To mitigate historical data biases, we use a generative large language model (LLM) as a teacher, generating fine-grained, semantically grounded supervision. This signal is distilled into our student model via an enriched distillation loss function. The resulting model produces skill-fit scores that enable consistent and interpretable person-job matching. 
Experiments on relevance, ranking, and calibration metrics demonstrate that our approach outperforms state-of-the-art baselines.",{"title":34,"searchDepth":35,"depth":35,"links":111},[],"2026-01-16","https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.10321",{},"\u002Fpapers\u002Fan-efficient-long-context-ranking-architecture-with-calibrated-llm-distillation-application-to-person-job-fit",{"title":103,"description":109},"papers\u002Fan-efficient-long-context-ranking-architecture-with-calibrated-llm-distillation-application-to-person-job-fit","yVxbqZBWMW5P3fpRpykGgVyFPmeLzy2cGOSDi5lETM0",{"id":120,"title":121,"authors":122,"body":125,"date":112,"description":129,"extension":5,"image":34,"link":139,"meta":140,"navigation":49,"path":141,"seo":142,"stem":143,"summary":34,"__hash__":144},"papers\u002Fpapers\u002Fevaluating-llm-behavior-in-hiring-implicit-weights-fairness-across-groups-and-alignment-with-human-preferences.md","Evaluating LLM Behavior in Hiring: Implicit Weights, Fairness Across Groups, and Alignment with Human Preferences",[123,66,63,70,124],"Morgane Hoffmann","Charles Pebereau",{"type":27,"value":126,"toc":137},[127,130],[30,128,129],{},"General-purpose Large Language Models (LLMs) show significant potential in recruitment applications, where decisions require reasoning over unstructured text, balancing multiple criteria, and inferring fit and competence from indirect productivity signals. Yet, it is still uncertain how LLMs assign importance to each attribute and whether such assignments are in line with economic principles, recruiter preferences or broader societal norms. We propose a framework to evaluate an LLM’s decision logic in recruitment, by drawing on established economic methodologies for analyzing human hiring behavior. 
We build synthetic datasets from real freelancer profiles and project descriptions from a major European online freelance marketplace and apply a full factorial design to estimate how an LLM weighs different match-relevant criteria when evaluating freelancer-project fit. We identify which attributes the LLM prioritizes and analyze how these weights vary across project contexts and demographic subgroups. Finally, we explain how a comparable experimental setup could be implemented with human recruiters to assess alignment between model and human decisions. Our findings reveal that the LLM weighs core productivity signals, such as skills and experience, but interprets certain features beyond their explicit matching value. While showing minimal average discrimination against minority groups, intersectional effects reveal that productivity signals carry different weights between demographic groups.",[30,131,132,136],{},[133,134,135],"strong",{},"keywords",": Large Language Models | Person-job Fit | Fairness | Interpretability",{"title":34,"searchDepth":35,"depth":35,"links":138},[],"https:\u002F\u002Farxiv.org\u002Fhtml\u002F2601.11379v1",{},"\u002Fpapers\u002Fevaluating-llm-behavior-in-hiring-implicit-weights-fairness-across-groups-and-alignment-with-human-preferences",{"title":121,"description":129},"papers\u002Fevaluating-llm-behavior-in-hiring-implicit-weights-fairness-across-groups-and-alignment-with-human-preferences","YLkIXjcGwXGIaHdky-w54IoUachiSuhUK8vVXVxYObs",{"id":146,"title":147,"authors":148,"body":149,"date":156,"description":153,"extension":5,"image":34,"link":157,"meta":158,"navigation":49,"path":159,"seo":160,"stem":161,"summary":34,"__hash__":162},"papers\u002Fpapers\u002Fskill-matching-at-scale-freelancer-project-alignment-for-efficient-multilingual-candidate-retrieval.md","Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval",[63,70,66],{"type":27,"value":150,"toc":154},[151],[30,152,153],{},"Finding the 
perfect match between a job proposal and a set of freelancers is not an easy task to perform at scale, especially in multiple languages. In this paper, we propose a novel neural retriever architecture that tackles this problem in a multilingual setting. Our method encodes project descriptions and freelancer profiles by leveraging pre-trained multilingual language models. The latter are used as backbone for a custom transformer architecture that aims to keep the structure of the profiles and project. This model is trained with a contrastive loss on historical data. Thanks to several experiments, we show that this approach effectively captures skill matching similarity and facilitates efficient matching, outperforming traditional methods.",{"title":34,"searchDepth":35,"depth":35,"links":155},[],"2024-09-19","https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12097",{},"\u002Fpapers\u002Fskill-matching-at-scale-freelancer-project-alignment-for-efficient-multilingual-candidate-retrieval",{"title":147,"description":153},"papers\u002Fskill-matching-at-scale-freelancer-project-alignment-for-efficient-multilingual-candidate-retrieval","9oppAGFI0bkDz3gWODCu2X2i-aPeRnSNbxhtwk0zu4c",{"members":164},[165,169,173,177],{"name":166,"role":167,"picture":168,"bio":34},"Warren Jouanneau, PhD","Lead Machine Learning Researcher","\u002Fimg\u002Fwarren-jouanneau.png",{"name":170,"role":171,"picture":172,"bio":34},"Emma Jouffroy, PhD","Machine Learning Researcher","\u002Fimg\u002Femma-jouffroy.png",{"name":174,"role":175,"picture":176,"bio":34},"Morgane Hoffmann, PhD","Data Scientist","\u002Fimg\u002Fmorgane-hoffmann.png",{"name":178,"role":179,"picture":180,"bio":34},"Marc Palyart, PhD","Director of Machine Learning","\u002Fimg\u002Fmarc-palyart.png",1776860002775]