World News - Find the latest world news and headlines today on politics, crime, entertainment, sports, lifestyle, technology and more
Tuesday, 31 December 2024
Iran confirms arrest of Italian journalist Cecilia Sala
from Yahoo News - Latest News & Headlines https://ift.tt/waqG0dj
Social Security January payment schedule: Here’s when beneficiaries will get their checks
from Yahoo News - Latest News & Headlines https://ift.tt/qEC71Yg
Meet the rich retired boomers who are now ultra-frugal because they are scared of going broke—even after saving for decades
from Yahoo News - Latest News & Headlines https://ift.tt/IMRKZ0P
Monday, 30 December 2024
North Korea's Kim lauds new fish farm, calls for regional development
from Yahoo News - Latest News & Headlines https://ift.tt/TrC206w
India rupee may dip, bond yields to track US peers
from Yahoo News - Latest News & Headlines https://ift.tt/5hExO6L
New top story on Hacker News: Tell HN: John Friel my father, internet pioneer and creator of QModem, has died
Tell HN: John Friel my father, internet pioneer and creator of QModem, has died
62 by AaronFriel | 2 comments on Hacker News.
If you knew him in life or remember his contribution to the world, please share your stories.
Sunday, 29 December 2024
Crazy Colorado storm system expected to bring extreme avalanche danger, high fire danger
from Yahoo News - Latest News & Headlines https://ift.tt/paYvwlI
Saturday, 28 December 2024
Check Your $2 Bills — They Could Be Worth a Ton
from Yahoo News - Latest News & Headlines https://ift.tt/y5PQtDc
Finland boards oil tanker suspected of causing internet, power cable outages
from Yahoo News - Latest News & Headlines https://ift.tt/UXhfljP
Vivek Ramaswamy Dragged After Wild Rant on How American Workers Suck
from Yahoo News - Latest News & Headlines https://ift.tt/SOHyU1j
Putin apologizes for Azerbaijan airlines plane crash but stops short of taking responsibility
from Yahoo News - Latest News & Headlines https://ift.tt/lV9GQNK
Friday, 27 December 2024
Thursday, 26 December 2024
Wednesday, 25 December 2024
Trump aims dig at Obama in bizarre hour-long Christmas Day Truth Social posting spree
from Yahoo News - Latest News & Headlines https://ift.tt/bO3vmaG
New top story on Hacker News: The Swedish cabin on the frontline of a possible hybrid war
The Swedish cabin on the frontline of a possible hybrid war
19 by Sami_Lehtinen | 2 comments on Hacker News.
Tuesday, 24 December 2024
Monday, 23 December 2024
Sunday, 22 December 2024
Trump: Panama will lose control of Canal if it continues to ‘rip-off’ US
from Yahoo News - Latest News & Headlines https://ift.tt/49TZ3Se
Should You Buy Quantum Computing Stocks in 2025?
from Yahoo News - Latest News & Headlines https://ift.tt/D9Cp7oq
Saturday, 21 December 2024
Jugging: Chandler PD raising awareness of crime targeting shoppers
from Yahoo News - Latest News & Headlines https://ift.tt/gnd2Qp0
Friday, 20 December 2024
Russian Space Program Confirms Plans to Destroy Space Station
from Yahoo News - Latest News & Headlines https://ift.tt/zcRxZwF
Thieves Trick Dealership Out Of Mercedes
from Yahoo News - Latest News & Headlines https://ift.tt/cES9HiK
Thursday, 19 December 2024
I’m a Financial Planner: Always Buy These 9 Things in January
from Yahoo News - Latest News & Headlines https://ift.tt/n6ojAfS
Wednesday, 18 December 2024
Ship dubbed ‘floating megabomb’ dumps toxic fertiliser in North Sea
from Yahoo News - Latest News & Headlines https://ift.tt/YijZ2HU
Tuesday, 17 December 2024
Monday, 16 December 2024
New top story on Hacker News: Show HN: NCompass Technologies – yet another AI Inference API, but hear us out
Show HN: NCompass Technologies – yet another AI Inference API, but hear us out
3 by adiraja | 5 comments on Hacker News.
Hello HackerNews! I’m excited to share what we’ve been working on at nCompass Technologies: an AI inference platform that gives you a scalable and reliable API to access any open-source AI model, with no rate limits. We don’t have rate limits because optimizations we made to our AI model serving software enable us to support a high number of concurrent requests without degrading quality of service for you as a user.

If you’re thinking, well, aren’t there a bunch of these already? So were we when we started nCompass. When using other APIs, we found that they weren’t reliable enough to use open-source models in production environments. To resolve this, we’re building an AI inference engine that enables you, as an end user, to reliably use open-source models in production. Underlying this API, we’re building optimizations at the hosting, scheduling and kernel levels with the single goal of minimizing the number of GPUs required to maximize the number of concurrent requests you can serve, without degrading quality of service. We’re still building a lot of our optimizations, but we’ve released what we have so far via our API. Compared to vLLM, we currently keep time-to-first-token (TTFT) 2-4x lower at the equivalent concurrent request rate. You can check out a demo of our API here: https://ift.tt/fPrksIQ

As a result of the optimizations we’ve rolled out so far, we’re releasing a few unique features on our API:

1. Rate limits: we don’t have any. Most other APIs out there have strict rate limits and can be rather unreliable. We don’t want APIs for open-source models to remain a solution for prototypes only. We want people to use these APIs like they do OpenAI’s or Anthropic’s and actually build production-grade products on top of open-source models.

2. Underserved models: we have them. There are a ton of models out there, but not all of them are readily available for people to use if they don’t have access to GPUs.

We envision our API becoming a system where anyone can launch any custom model of their choice with minimal cold starts and run the model as a simple API call. Our cold starts for any 8B or 70B model are only 40s, and we’ll keep improving this. Towards this goal, we already have models like `ai4bharat/hercule-hi` hosted on our API to support non-English language use cases and models like `Qwen/QwQ-32B-Preview` to support reasoning-based use cases. You can find the other models that we host here: https://ift.tt/fMYyvbd. We’d love for you to try out our API by following the steps here: https://ift.tt/UNKCion. We provide $100 of free credit on sign-up to run models, and like we said, go crazy with your requests; we’d love to see if you can break our system :) We’re still actively building out features and optimizations, and your input can help shape the future of nCompass. If you have thoughts on our platform or want us to host a specific model, let us know at hello@ncompass.tech. Happy Hacking!
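The time-to-first-token metric the post benchmarks against vLLM is easy to measure for any streaming endpoint. The sketch below is my own illustration, not nCompass’s client: it times the gap between issuing a request and receiving the first token from any token iterator, using a fake stream with simulated prefill latency as a stand-in for a real model.

```python
import time

def time_to_first_token(stream):
    """Measure seconds from call until the stream yields its first token.

    `stream` is any iterator of tokens, e.g. a streaming HTTP response body.
    Returns (ttft_seconds, tokens) so the full response is not lost.
    """
    start = time.perf_counter()
    tokens = []
    ttft = None
    for tok in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # first token arrived
        tokens.append(tok)
    return ttft, tokens

def fake_stream():
    """Stand-in for a real streaming response: stall, then emit tokens."""
    time.sleep(0.05)  # simulated prefill latency before the first token
    yield "Hello"
    yield ", world"

ttft, toks = time_to_first_token(fake_stream())
```

Comparing this number across providers at the same concurrent request rate is what a claim like "2-4x lower TTFT than vLLM" amounts to.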
Sunday, 15 December 2024
Saturday, 14 December 2024
Friday, 13 December 2024
New top story on Hacker News: Show HN: I made the slowest, most expensive GPT
Show HN: I made the slowest, most expensive GPT
23 by wluk | 13 comments on Hacker News.
This is another one of my automate-my-life projects - I'm constantly asking the same question to different AIs since there's always the hope of getting a better answer somewhere else. Maybe ChatGPT's answer is too short, so I ask Perplexity. But I realize that's hallucinated, so I try Gemini. That answer sounds right, but I cross-reference with Claude just to make sure. This doesn't really apply to math/coding (where o1 or Gemini can probably one-shot an excellent response), but more to online search, where information is more fluid and there's no "right" search engine + text restructuring + model combination every time. Even o1 doesn't have online search, so it's obviously a hard problem to solve.

An example is something like "best ski resorts in the US", which will get a different response from every GPT, but most of their rankings won't reflect actual skiers' consensus - say, on Reddit https://ift.tt/6CzA0R4... - because there are so many opinions floating around that a one-shot RAG search + LLM isn't going to have enough context to find what everyone thinks. And obviously, offline GPTs like o1 and Sonnet/Haiku aren't going to have the latest updates if a resort closes, for example.

So I’ve spent the last few months experimenting with a new project that's basically the most expensive GPT I’ll ever run. It runs search queries through ChatGPT, Claude, Grok, Perplexity, Gemini, etc., then aggregates the responses. For added financial tragedy, in between it also uses multiple embedding models and performs iterative RAG searches through different search engines. This all functions sort of like one giant AI brain. So I pay for every search, then every embedding, then every intermediary LLM input/output, then the final LLM input/output. On average it costs about 10 to 30 cents per search. It's also extremely slow. https://ithy.com I know that sounds absurdly overkill, but that’s kind of the point.

The goal is to get the most accurate and comprehensive answer possible, because it's been vetted by a bunch of different AIs, each sourcing from different buckets of websites. Context limits today are just large enough that this type of search and cross-model iteration is possible, where we can determine the "overlap" between a diverse set of texts to reach some sort of consensus. The idea is to get online answers that aren't attainable from any single AI. If you end up trying this out, I'd recommend comparing Ithy's output against the other GPTs to see the difference. It's going to cost me a fortune to run this project (I'll probably keep it online for a month or two), but I see it as an exploration of what’s possible with today’s model APIs rather than something that’s immediately practical. Think of it as an online o1 (without the $200/month price tag, though I'm offering a $29/month Pro plan to help subsidize it). If nothing else, it’s a fun (and pricey) thought experiment.
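The "overlap between a diverse set of text" idea can be approximated very crudely. The sketch below is my own illustration, not Ithy's pipeline: it scores each model's answer by its average Jaccard word overlap with the other answers and keeps the most agreed-with one, so a lone outlier answer loses to the rough consensus.

```python
def word_set(text: str) -> set[str]:
    """Lowercased bag of words for a quick lexical-overlap comparison."""
    return set(text.lower().split())

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def consensus_answer(answers: dict[str, str]) -> str:
    """Return the model whose answer overlaps most with all the others."""
    sets = {model: word_set(text) for model, text in answers.items()}
    def avg_overlap(model: str) -> float:
        others = [jaccard(sets[model], s) for m, s in sets.items() if m != model]
        return sum(others) / len(others)
    return max(sets, key=avg_overlap)

# Hypothetical answers from three models to the ski-resort question.
answers = {
    "model_a": "alta and jackson hole top most skier rankings",
    "model_b": "alta and jackson hole are often ranked highest",
    "model_c": "the best resort is clearly my local hill",
}
best = consensus_answer(answers)
```

A real system would compare embeddings rather than word sets, but the shape of the computation - pairwise similarity, then pick the answer the others agree with - is the same.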
Thursday, 12 December 2024
New top story on Hacker News: Show HN: Gentrace – connect to your LLM app code and run/eval it from a UI
Show HN: Gentrace – connect to your LLM app code and run/eval it from a UI
7 by dsaffy | 0 comments on Hacker News.
Hey HN - Doug from Gentrace here. We originally launched via Show HN in August of 2023 as evaluation and observability for generative AI: https://ift.tt/KX6bsUO

Since then, everyone from the model providers to LLM ops companies has built a prompt playground. We had one too, until we realized this was totally the wrong approach:

- It's not connected to your application code
- They don't support all models
- You have to rebuild evals for just this one prompt (you can't use your end-to-end evals)

In other words, it took a ton of work and time to use these to actually make your app better. So, we built a new experience and are relaunching around this idea: Gentrace is a collaborative LLM app testing and experimentation platform that brings together engineers, PMs, subject matter experts, and more to run and test your actual end-to-end app. To do this, use our SDK to:

- connect your app to Gentrace as a live runner over websocket (local) / via webhook (staging, prod)
- wrap your parameters (e.g. prompt, model, top-k) so they become tunable knobs in the front end
- edit the parameters and then run / evaluate the actual app code with datasets and evals in Gentrace

We think it's great for tuning retrieval systems, upgrading models, and iterating on prompts. It's free to trial. Would love to hear your feedback / what you think!
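The "wrap your parameters so they become tunable knobs" step is a general pattern worth seeing concretely. The sketch below is my own illustration, not Gentrace's SDK: a registry holds each parameter's default, and a UI or test harness can override values per run without touching the app code that reads them.

```python
class Knobs:
    """Registry of tunable parameters with defaults, overridable per run."""

    def __init__(self):
        self._defaults = {}
        self._overrides = {}

    def register(self, name, default):
        """Declare a parameter and its default value."""
        self._defaults[name] = default
        return default

    def override(self, **values):
        """Apply overrides, e.g. from an experiment UI."""
        self._overrides.update(values)

    def get(self, name):
        """Read the current value: override if set, else the default."""
        return self._overrides.get(name, self._defaults[name])

knobs = Knobs()
knobs.register("model", "gpt-4o-mini")  # hypothetical default model
knobs.register("top_k", 5)

def retrieve(query):
    # A real app would query a vector store; here we just echo the knobs.
    return f"search {query!r} with {knobs.get('model')}, top_k={knobs.get('top_k')}"

baseline = retrieve("refund policy")
knobs.override(top_k=10)  # an experiment run flips a knob
experiment = retrieve("refund policy")
```

Because the app reads every parameter through the registry, the same end-to-end code path serves both the baseline and the experiment.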
Wednesday, 11 December 2024
Tuesday, 10 December 2024
New top story on Hacker News: Ask HN: Those making $500/month on side projects in 2024 – Show and tell
Ask HN: Those making $500/month on side projects in 2024 – Show and tell
87 by cvbox | 72 comments on Hacker News.
It's the time of the year again, so I'd be interested to hear what new (and old) ideas have come up. Previously asked on:

2023 → https://ift.tt/3mUX6ZK
2022 → https://ift.tt/ehPHQCk
2021 → https://ift.tt/ExoqimD
2020 → https://ift.tt/vRdE92T
2019 → https://ift.tt/XnJdAc1
2018 → https://ift.tt/ABfve6h
2017 → https://ift.tt/gqDshEa
Monday, 9 December 2024
I’m a Mechanic: 9 Cars I Would Never Buy and Why They Aren’t Worth It
from Yahoo News - Latest News & Headlines https://ift.tt/QiV7lHc
Sunday, 8 December 2024
Saturday, 7 December 2024
Surgeons Are Sharing Their Wildest "Oh, Crap!" Moments From The Job, And I'm Too Stunned To Speak
from Yahoo News - Latest News & Headlines https://ift.tt/FcfpNB9
These Are The Least Satisfying Cars And SUVs You Can Buy In 2025, According To Consumer Reports
from Yahoo News - Latest News & Headlines https://ift.tt/Q2MACm9
New top story on Hacker News: Ask HN: Next best to incorporate other than Delaware?
Ask HN: Next best to incorporate other than Delaware?
9 by ksec | 7 comments on Hacker News.
I wish this were more of an Ask YC / VC. I have been told it is either Nevada, Wyoming, or Texas, considering Stripe Atlas doesn't support anything other than Delaware. Does anyone have any pros and cons or recommendations?
Friday, 6 December 2024
Thursday, 5 December 2024
Wednesday, 4 December 2024
New top story on Hacker News: Show HN: I combined spaced repetition with emails so you can remember anything
Show HN: I combined spaced repetition with emails so you can remember anything
15 by iskrataa | 3 comments on Hacker News.
Hey HN, I am a student shipping apps in my free time. This is my 4th for the year! Non-fiction books and podcasts have been part of my life for years now, but I always struggled with remembering what I’ve read or listened to. I wanted it to stick even after years. My notes list grew large, but I never really revisited them. That’s why I created GinkgoNotes. You can enter notes you want to recall and leave it to the app to create a personalised email schedule based on spaced repetition. That means you’ll get your notes emailed to you a couple of times, exactly when you should read them again (based on Ebbinghaus's forgetting curve), so you're much more likely to remember them. I hope this will be as helpful for you as it was for me. Would love some feedback! Iskren
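A spaced-repetition email schedule of the kind described is simple to sketch. The intervals below are illustrative and loosely inspired by the forgetting curve, not GinkgoNotes' actual algorithm: each review gap grows, mirroring how retention decays more slowly after every repetition.

```python
from datetime import date, timedelta

# Illustrative review gaps in days; each gap roughly doubles, mirroring
# how the forgetting curve flattens with every successful review.
INTERVALS = [1, 3, 7, 14, 30, 90]

def email_schedule(created: date, intervals=INTERVALS):
    """Dates on which a note should be emailed back to its owner."""
    dates = []
    day = created
    for gap in intervals:
        day = day + timedelta(days=gap)
        dates.append(day)
    return dates

# A note saved on the day of this post gets its first reminder the next day
# and its last one roughly five months out.
sched = email_schedule(date(2024, 12, 4))
```

Production systems usually adjust the gaps per item based on whether you actually recalled it (as in SM-2-style algorithms), but a fixed expanding schedule already captures the core idea.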
Tuesday, 3 December 2024
Monday, 2 December 2024
Sunday, 1 December 2024
Texas billionaire brothers plan big development south of McCall. Up first: 1,100 homes
from Yahoo News - Latest News & Headlines https://ift.tt/0V1Sna2