Collection size

96 articles

May 13, 2026 · Artificial Intelligence

Multimodal RAG: A Complete Guide to Ingesting Images, Tables, and PDFs

This article examines the blind spot of pure‑text RAG for visual content, compares three multimodal ingestion strategies—CLIP embeddings, image‑to‑text captioning with a MultiVectorRetriever, and ColPali visual retrieval—covers table‑specific handling, presents end‑to‑end TypeScript implementations, and lists common pitfalls to avoid when deploying production‑grade multimodal RAG pipelines.

CLIP, ColPali, Image Captioning

0 likes · 22 min read

JavaEdge

Apr 2, 2026 · Artificial Intelligence

Unlocking Qwen3.6-Plus: Features, Multimodal Performance, and API Guide

This article provides an in‑depth overview of the Qwen3.6‑Plus model, detailing its million‑token context window, enhanced multimodal reasoning, benchmark results across language and vision tasks, and step‑by‑step instructions for using the official API and integrating the model with popular coding assistants.

Qwen3.6-Plus, api-integration, code agents

0 likes · 12 min read

Model Perspective

Mar 8, 2026 · Artificial Intelligence

Top 10 Must-Have OpenClaw Skills and How to Install Them Safely

This article introduces OpenClaw, an open‑source AI assistant framework, explains its skill system, lists the ten most popular community skills with installation commands, and highlights security considerations for safely extending its capabilities.

0 likes · 10 min read

ByteFE

Oct 11, 2023 · Artificial Intelligence

CR Copilot: An Open‑Source LLM‑Based Code Review Assistant with Private Knowledge Base

This article describes the design and implementation of a code‑review assistant powered by open‑source large language models and a privately hosted knowledge base, covering background, pain points, system architecture, model selection, vector‑store integration, prompt engineering, diff parsing, and practical reflections.

AI, Code Review, Knowledge Base

0 likes · 24 min read

James' Growth Diary

May 14, 2026 · Artificial Intelligence

LLM Semantic Routing Explained: Model‑Based Intent Classification and Three Keyword‑Matching Pitfalls

This article breaks down LLM semantic routing as a classifier, compares keyword, embedding, and LLM‑based routes, provides full TypeScript implementations, introduces hybrid routing for speed and accuracy, and covers production‑grade observability and dynamic configuration to avoid common pitfalls.

Hybrid Routing, LLM, LangChain

0 likes · 33 min read

CodeTrend

Apr 29, 2026 · Industry Insights

Top GitHub Repos of April 29 2026: AI Agents, Voice AI, Finance Apps and More

The CodeTrend daily briefing spotlights the most starred GitHub projects of April 29 2026, covering today’s top picks, the week’s hottest repositories and the month’s biggest surges across AI agents, developer tools, finance applications, hacking utilities and more.

AIC, Developer Tools

0 likes · 50 min read

Ops Development Stories

Jul 29, 2025 · Artificial Intelligence

Master AI Agents with LangGraph: Build Adaptive RAG, Translation, and ReAct Agents

This comprehensive guide explains what an AI Agent is, its core capabilities and design patterns, and walks through step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph, complete with code samples, workflow diagrams, and practical tips for building personal ops knowledge‑base agents.

LLM, LangGraph, RAG

0 likes · 64 min read

Baobao Algorithm Notes

Jan 27, 2026 · Artificial Intelligence

Putting Kimi K2.5 and Kimi Code to the Test: Real‑World AI Agent Benchmarks

This article presents a hands‑on evaluation of Kimi K2.5 and its open‑source Kimi Code agent across a series of hard‑core prompts, covering Python API generation, cost‑optimized routing, multimodal ECharts visualisation, massive‑scale SQL optimisation, web‑search‑driven research, MoE explanation and video‑to‑code workflows.

AI agent, Kimi, Large Language Model

0 likes · 9 min read

Architect's Alchemy Furnace

Jul 7, 2025 · Artificial Intelligence

How to Build an AI‑Powered Knowledge Retrieval Workflow with Dify and Embedding Models

This guide explains how to organize a knowledge base, select embedding and rerank models, clean user queries, and construct a Dify DSL workflow that iteratively retrieves and merges information before handing it to a large language model for answer generation.

AI workflow, Dify, Knowledge retrieval

0 likes · 23 min read

Ops Development Stories

Sep 19, 2024 · Artificial Intelligence

How to Connect Qwen LLMs with Higress AI Gateway: A Hands‑On Guide

This tutorial walks through setting up a local k3d cluster, installing Higress, and using its AI plugins—including AI Proxy, AI JSON formatter, AI Agent, and AI Statistics—to integrate and observe Alibaba Cloud's Qwen large language models across various use cases such as weather and flight queries.

AI gateway, AI plugins, Higress

0 likes · 30 min read

Geek Labs

Mar 31, 2026 · Artificial Intelligence

5 Open‑Source AI Projects: Lark CLI, OpenSpace, G0DM0D3, Awesome‑AI List, and Meta TribeV2

The article presents five notable open‑source AI projects, outlining their features, use cases, and performance: Lark CLI for office automation, OpenSpace with self‑evolving agents (4.2× gain, 46% token saving), G0DM0D3 as a privacy‑focused multi‑model chat alternative, a curated truly‑open AI list, and Meta’s TribeV2 multimodal brain‑encoding model for neuroscience research.

AI agents, G0DM0D3, Meta TribeV2

0 likes · 12 min read

Java Tech Enthusiast

Mar 5, 2024 · Artificial Intelligence

Claude 3 vs GPT‑4: A Deep Dive into the New AI Giant’s Multimodal Edge

Claude 3 has arrived, outperforming GPT‑4 across benchmark scores, offering free Sonnet and paid Opus tiers, and showcasing unprecedented multimodal, long‑context, and code‑generation abilities that reshape competitive dynamics in large‑language‑model research.

Anthropic, Claude 3, GPT-4 comparison

0 likes · 12 min read

Machine Learning Algorithms & Natural Language Processing

Apr 22, 2026 · Artificial Intelligence

Hands‑On Kimi K2.6 + Hermes: A Karpathy‑Style Step‑by‑Step Guide

This article presents a detailed, hands‑on tutorial for deploying Kimi K2.6 with Hermes and Obsidian, showcases multi‑modal video note‑taking, skill creation, self‑evolving LLM‑driven knowledge bases, large‑scale agent clusters, and discusses both the strengths and current limitations of the system.

Agent Systems, Hermes, Kimi K2.6

0 likes · 10 min read

Full-Stack Cultivation Path

Sep 4, 2024 · Artificial Intelligence

Hot Open-Source RAG Tool for Document Chat: GraphRAG, Multimodal QA & Complex Reasoning

This article introduces Kotaemon, an open‑source Retrieval‑Augmented Generation platform that lets users chat with their documents, offering a self‑hosted web UI, support for local and API LLMs, hybrid retrieval, multimodal question answering, GraphRAG indexing, and advanced reasoning capabilities, along with step‑by‑step installation via App or Docker.

GraphRAG, LLM, Multimodal QA

0 likes · 6 min read

SuanNi

May 30, 2026 · Artificial Intelligence

Step 3.7 Flash: High‑Efficiency Pro‑Level Agent Model with 400 TPS and Low Cost

Step 3.7 Flash is a 196B‑parameter, 11B‑activation multimodal agent model that delivers 400 TPS inference, superior code‑generation and cross‑framework stability, cost‑effective Advisor Mode, and strong vision and search performance, with extensive benchmark gains over its predecessor and competing models.

AI agent, Advisor Mode, Multimodal

0 likes · 12 min read

Kuaishou Tech

Aug 23, 2025 · Artificial Intelligence

How Thyme Enables Models to Think Beyond Images with Code‑Driven Multimodal Reasoning

The Kwai Keye team presents Thyme, a novel multimodal reasoning framework that lets large language models generate and safely execute Python code for image manipulation and complex calculations, achieving significant performance gains over existing vision‑language models across perception, reasoning, and hallucination‑reduction benchmarks.

AI research, Large Language Model, Multimodal

0 likes · 12 min read

Tech Minimalism

Jan 28, 2026 · Artificial Intelligence

Master Oh My Claude Code: Complete Guide to Multi‑Agent AI Coding with Claude

Oh My Claude Code transforms Claude Code into a multi‑agent orchestration platform, offering five execution modes, 32 specialized agents, automatic model routing, and simple installation, enabling developers to automate complex coding tasks from planning to testing with natural‑language commands.

AI coding, Claude Code, Oh My Claude Code

0 likes · 15 min read

Old Zhang's AI Learning

Jan 23, 2026 · Artificial Intelligence

Top AI Projects: Mobile‑Controlled Claude Code, Hackathon‑Winner Claude Config, and AI‑Powered Document Illustration

The article introduces four standout AI tooling projects—Happy Coder for mobile Claude Code access, the Everything Claude Code configuration suite from a hackathon champion, the Document Illustrator skill that auto‑generates illustrations, and the free Gemini CLI course—plus the skills.sh marketplace, detailing their features, installation steps, and practical evaluations.

Agent Skills, Claude Code, Document Illustrator

0 likes · 12 min read

Old Meng AI Explorer

Apr 17, 2026 · Artificial Intelligence

Which AI Coding CLI Reigns Supreme? A Data‑Driven Comparison of Claude, Codex, Copilot, and Gemini

This article presents a hands‑on, data‑backed comparison of the four leading AI programming command‑line tools—Claude Code, GitHub Copilot CLI, Google Gemini CLI, and OpenAI Codex—covering installation ease, command design, agent capabilities, extensibility, multimodal support, security, pricing, real‑world scenarios, and benchmark results to help developers choose the tool that best fits their specific workflow.

AI coding, CLI tools, comparison

0 likes · 18 min read

Old Zhang's AI Learning

May 9, 2026 · Artificial Intelligence

Why Gemini’s Multimodal RAG with File Search Is So Compelling

The article analyzes Google Gemini’s File Search tool as a fully managed multimodal RAG solution, detailing its architecture, key features, pricing model, step‑by‑step usage, strengths, limitations, and how it compares with OpenAI Assistants File Search and Vertex AI Search.

AI Retrieval, Embedding, File Search

0 likes · 14 min read