Tagged articles
9 articles
Machine Heart
May 13, 2026 · Artificial Intelligence

Super‑Charging MiniCPM‑V 4.6 on One RTX 4090: 1B‑Parameter Multimodal Model Sets New Efficiency Bar

MiniCPM‑V 4.6, a 1.3B‑parameter multimodal LLM, outperforms rivals such as Qwen3.5‑0.8B and Gemma 4 on both accuracy and speed. Early ViT token compression and 4×/16× visual token reduction deliver sub‑100 ms latency and throughput above 2.6k tokens/s on a single RTX 4090, and the model also runs offline on mobile devices.

Edge AI · MiniCPM-V · Multimodal LLM
16 min read
AI Explorer
Apr 16, 2026 · Artificial Intelligence

Claude Opus 4.7: How Anthropic’s New Model Makes AI Programming Autonomous

Anthropic’s Claude Opus 4.7, released on April 16, 2026, boosts visual resolution threefold, adds self‑verifying programming ability, delivers strong benchmark gains across code review, data analysis, legal and financial tasks, and introduces new inference tiers and security controls, reshaping AI‑assisted software development.

AI programming · Anthropic · Claude Opus 4.7
11 min read
AI Insight Log
Mar 14, 2026 · Artificial Intelligence

Opus 4.6 Unlocks Full 1M‑Token Context—GPT‑5.4 Slumps to 36% Accuracy

Anthropic opened its million‑token context window for Claude Opus 4.6, which scores 78.3% on MRCR v2 while competing models such as GPT‑5.4 and Gemini 3.1 Pro fall below 40%. The release also removes pricing premiums, expands media limits six‑fold, and requires no code changes, dramatically improving Claude Code workflows.

AI Performance · Anthropic · Claude Opus
8 min read
AntTech
Dec 6, 2025 · Artificial Intelligence

FinEval‑KR: Diagnosing Knowledge vs. Reasoning Gaps in Financial Large Language Models

FinEval‑KR, a new EMNLP 2025 evaluation framework co‑authored by Shanghai University of Finance and Economics and Ant Group, separates knowledge coverage from logical reasoning to reveal why financial LLMs often hallucinate on calculation tasks. It introduces the KS, RS, and CS metrics and ranks 18 state‑of‑the‑art models on a rigorously curated finance dataset.

Knowledge vs reasoning · LLM evaluation · finance AI
14 min read
Fun with Large Models
Aug 19, 2025 · Artificial Intelligence

Deep Dive into OpenAI’s GPT‑OSS and GPT‑5: Features, Performance, and Controversies

The article provides a detailed analysis of OpenAI’s newly released open‑source GPT‑OSS models (20B and 120B) and the closed‑source GPT‑5 family, covering their architectures, training pipelines, benchmark results, practical usage observations, pricing, and the mixed user feedback that surrounds GPT‑5.

GPT-5 · GPT-OSS · OpenAI
13 min read
Fighter's World
Nov 1, 2024 · Artificial Intelligence

How Fiercely Competitive Is the Large‑Model Landscape? Insights from the State of AI Report 2024

The State of AI Report 2024 reveals converging capabilities among open and closed LLMs, a shift toward inference compute, benchmark and data contamination challenges, rising synthetic‑data risks, booming robotics research, Nvidia's hardware dominance, and a mix of accurate and missed predictions for the coming year.

AI Industry · AI hardware · Large Language Models
15 min read
Baobao Algorithm Notes
Apr 11, 2022 · Artificial Intelligence

Can ResNet Still Beat Transformers? A Deep Dive into Modern Training Tricks

This article reviews recent research and official PyTorch blog updates that modify ResNet architectures and training tricks, compares their performance against EfficientNet, ConvNeXt, and Vision Transformers using extensive ImageNet benchmarks, and provides both literature‑based and local evaluation results to assess whether classic CNNs remain competitive.

CNN · ResNet · Vision Transformer
13 min read