Tag

AI models

Java Architecture Diary
May 19, 2025 · Artificial Intelligence

How Ollama 0.7 Unlocks Local Multimodal AI with One Command

Ollama 0.7 introduces a fully re‑engineered core that brings seamless multimodal model support, lists top visual models, showcases OCR and image analysis capabilities, explains technical breakthroughs, and provides a quick three‑step guide to deploy powerful local AI vision.

AI Engineering · AI models · Image Recognition
0 likes · 7 min read
DataFunTalk
Apr 10, 2025 · Artificial Intelligence

Google Cloud Next 25: Comprehensive Overview of New AI Models, Tools, and Protocols

Google Cloud Next 25 unveiled a wealth of AI advancements, including five new generative models, a groundbreaking Agent‑to‑Agent protocol, upgraded AI‑powered developer tools, expanded AI applications across Workspace, and the high‑performance Ironwood TPU for inference, offering developers a clear view of the latest AI landscape.

AI models · Agent protocol · Gemini
0 likes · 14 min read
DataFunTalk
Mar 21, 2025 · Artificial Intelligence

OpenAI Unveils New STT and TTS Models: gpt-4o-transcribe, gpt-4o-mini-transcribe, and gpt-4o-mini-tts – Performance, Pricing, and Demo

OpenAI announced three new speech models—two STT models (gpt-4o-transcribe and its lightweight gpt-4o-mini-transcribe) and one TTS model (gpt-4o-mini-tts)—showcasing strong accuracy on multilingual benchmarks, competitive pricing, and a quick‑start API demo for developers.

AI models · GPT-4o · OpenAI
0 likes · 8 min read
Nightwalker Tech
Feb 17, 2025 · Artificial Intelligence

Comparative Analysis of Programming Capabilities of DeepSeek v3, Gemini Flash 2.0, and Claude 3.5 Sonnet

This article compares three leading AI programming assistants—DeepSeek v3, Gemini Flash 2.0, and Claude 3.5 Sonnet—examining their characteristics, coding abilities, debugging features, supported languages, and optimal use cases to help readers select the most suitable model for their specific development or data‑analysis needs.

AI models · code generation · model comparison
0 likes · 7 min read
DevOps
Jan 7, 2025 · Artificial Intelligence

Microsoft’s 2025 AI Predictions: Stronger Models, AI Agents, AI Companions, Efficient Resources, Testing & Customization, and Accelerated Scientific Research

Microsoft outlines six 2025 AI forecasts—including more powerful models, autonomous AI agents reshaping work, AI companions aiding daily life, greener resource use, rigorous testing and customization, and AI-driven scientific breakthroughs—highlighting how these advances will transform industries, research, and everyday experiences.

2025 predictions · AI · AI agents
0 likes · 8 min read
DataFunSummit
Sep 17, 2024 · Artificial Intelligence

Multimodal Video Understanding for Real-World Surveillance: Tasks, Dataset, Models, and Challenges

This article presents a comprehensive overview of multimodal video understanding for real-world surveillance, covering task definitions, the new UCA multimodal surveillance dataset, baseline models for video moment localization, captioning, and anomaly detection, experimental results, challenges, and future research directions.

AI models · Large Language Models · multimodal video understanding
0 likes · 19 min read
DataFunSummit
Sep 16, 2024 · Artificial Intelligence

Multimodal Content Understanding and Cold-Start Practices in NetEase Cloud Music Community Recommendation System

This article details how NetEase Cloud Music leverages multimodal content understanding—using audio models like MusicCLIP and Audio MAE and image‑text fusion via FLAVA—to improve recommendation performance for new content and new users, covering system architecture, cold‑start solutions, and future AI‑driven directions.

AI models · Cold Start · audio representation
0 likes · 15 min read
Bilibili Tech
Sep 6, 2024 · Artificial Intelligence

AI Empowering Software Development for Quality and Efficiency

The QECon Global Software Quality and Efficiency Conference in Shanghai on September 20‑21 will explore how AI—especially AIGC, LLMs, and large models—enhances software development, testing, and performance, featuring expert talks on multi‑device quality assurance and practical test‑shift strategies, highlighting innovative opportunities and real‑world value.

AI in software development · AI models · Conference
0 likes · 2 min read
DataFunTalk
Jun 11, 2024 · Artificial Intelligence

Guide to Fine‑Tuning OpenAI Models for Improved Performance

This guide explains how to fine‑tune OpenAI’s pre‑trained models, covering data preparation, environment setup, API usage, code examples, hyper‑parameter tuning, monitoring, and best practices to achieve better performance with less data and compute resources.

AI models · API · Fine-tuning
0 likes · 16 min read
ZhongAn Tech Team
Feb 19, 2024 · Artificial Intelligence

Weekly Tech Digest: AI Breakthroughs, Hardware Shifts, and Industry Insights

This weekly technology digest highlights major industry developments, including Huawei's smartphone market resurgence, Google's internal AI coding assistant, Nvidia's accelerated GPU delivery timelines, and expert perspectives on OpenAI's Sora video generation model, alongside significant funding initiatives for AI semiconductor manufacturing.

AI models · Artificial Intelligence · GPU Supply Chain
0 likes · 8 min read
Rare Earth Juejin Tech Community
Aug 30, 2023 · Artificial Intelligence

AudioCraft: An Open‑Source PyTorch Library for Audio Generation with MusicGen, AudioGen, and EnCodec

AudioCraft is a PyTorch library that bundles state‑of‑the‑art AI models—MusicGen, AudioGen, and the EnCodec codec—to generate high‑quality audio from text or reference sounds, and the article explains its architecture, evaluation results, and how to install and run it.

AI models · Audio Generation · AudioGen
0 likes · 9 min read
DataFunSummit
Apr 13, 2023 · Artificial Intelligence

ModelScope CV Model Overview: Visual Detection and Keypoint Applications

This article presents a comprehensive overview of ModelScope's computer‑vision models, detailing visual detection and keypoint solutions (including ViTDet, YOLOX, Res2Net, HRNet, and 3D pose models), their architectures, performance highlights, real‑world applications, and future development plans.

AI models · ModelScope · computer vision
0 likes · 11 min read
Tencent Advertising Technology
Mar 10, 2023 · Artificial Intelligence

Optimizing Large-Scale Model Training with Tencent's AngelPTM and ZeRO-Cache

This article presents Tencent's latest advancements in large‑scale model training, detailing the AngelPTM framework and its ZeRO‑Cache optimization techniques that reduce memory and storage costs, improve hardware utilization, and achieve high‑performance training for trillion‑parameter AI models across various applications.

AI models · AngelPTM · Large-Scale Training
0 likes · 14 min read
DataFunTalk
Nov 28, 2021 · Artificial Intelligence

Fine‑Grained Content Understanding and Operation in QQ Music: Optimizing the Recommendation System

This article presents QQ Music’s end‑to‑end solution for data‑driven content understanding, value evaluation, and fine‑grained operation, detailing offline and real‑time pipelines, neural‑network models, a content middle‑platform, parameter services, and a precise delivery system that boost user engagement while preserving experience.

AI models · content understanding · data-driven operation
0 likes · 24 min read
DataFunTalk
Jun 1, 2020 · Artificial Intelligence

Emotion Analysis Techniques in Alibaba's Intelligent Customer Service System

This article presents a comprehensive overview of emotion analysis technologies employed in Alibaba's intelligent customer service platform, detailing models for user emotion detection, emotional response generation, service quality inspection, satisfaction prediction, and intelligent human‑agent handoff, along with experimental results and future research directions.

AI models · Intelligent Customer Service · Natural Language Processing
0 likes · 40 min read