The problem is simple: consumer motherboards don’t have that many PCIe slots, and consumer CPUs don’t have enough lanes to run 3+ GPUs at full PCIe gen 3 or gen 4 speeds.

My idea was to buy 3-4 cheap computers, slot a GPU into each, and run them in tandem. I imagine this will require some sort of agent running on each node, with the nodes connected over a 10GbE network. I can get a 10GbE network running for this project.

Does Ollama or any other local AI project support this? Getting a server motherboard with a CPU is going to get expensive very quickly, and this would be a great alternative.
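
For context, the client side I have in mind is roughly this: each node runs a stock Ollama server with a full copy of a small model, and something fans requests out over the network. A rough sketch; the hostnames are made up, and this only spreads separate requests across nodes rather than splitting one model:

    # Hypothetical fan-out client: each node runs a stock Ollama server
    # (default port 11434) with the same model pulled locally.
    import itertools
    import requests

    NODES = itertools.cycle([
        "http://node1.lan:11434",  # made-up hostnames
        "http://node2.lan:11434",
        "http://node3.lan:11434",
        "http://node4.lan:11434",
    ])

    def generate(prompt: str, model: str = "mistral-small") -> str:
        """Send one request to the next node in round-robin order."""
        node = next(NODES)
        r = requests.post(f"{node}/api/generate",
                          json={"model": model, "prompt": prompt, "stream": False},
                          timeout=300)
        r.raise_for_status()
        return r.json()["response"]

    print(generate("Why is the sky blue?"))

What I can’t tell is whether anything supports actually splitting a single model across the nodes, which is the part I’m asking about.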

Thanks

  • BombOmOm@lemmy.world

    Basically no GPU needs a full PCIe x16 slot to run at full speed. There are motherboards out there which will give you 3 or 4 slots of PCIe x8 electrical (x16 physical). I would look into those.
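
    To put rough numbers on it (my own back-of-the-envelope figures, not a benchmark):

        # Approximate effective PCIe bandwidth per lane in GB/s,
        # after encoding overhead.
        PER_LANE_GBS = {3: 0.985, 4: 1.969}

        def link_gbs(gen: int, lanes: int) -> float:
            """One-way bandwidth of a PCIe link in GB/s."""
            return PER_LANE_GBS[gen] * lanes

        model_gb = 24  # a 3090's worth of weights, as an example
        for gen, lanes in [(3, 16), (4, 8), (4, 4)]:
            bw = link_gbs(gen, lanes)
            print(f"gen{gen} x{lanes}: {bw:5.1f} GB/s, "
                  f"~{model_gb / bw:.1f} s to load {model_gb} GB")

    Once the weights are on the card, per-token traffic over the bus is tiny, so a narrower link mostly just slows the initial model load.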

    Edit: If you are willing to buy a board that supports AMD Epyc processors, you can get boards with basically as many PCIe slots as you could ever hope for. But that is almost certainly overkill for this task.

    • marauding_gibberish142@lemmy.dbzer0.comOP

      Aren’t Epyc boards really expensive? I was going to buy 3-4 used computers and stuff a GPU in each.

      Are there motherboards on the used market that can run the E5-2600 v4 series CPUs and have multiple PCIe x16 slots? The only ones I found were super expensive/esoteric.

      • reptar@lemmy.world

        Hey, I built a micro-ATX Epyc machine for work that has tons of PCIe slots. Pretty sure it was an ASRock (or ASRock Rack). I can find the details tomorrow if you’d like. Just let me know!

        E: well, it looks like I remembered wrong and it was ATX, not micro-ATX. I think it’s the ASRock Rack ROMED8-2T, which has 7 PCIe 4.0 x16 slots (I needed a lot). Unfortunately I don’t think it’s sold anymore, except at really high prices on eBay.

        • marauding_gibberish142@lemmy.dbzer0.comOP

          Thank you, and that highlights the problem - I don’t see any affordable options (around $200 or so for a motherboard + CPU combo) with a lot of PCIe lanes, other than Frankenstein boards from AliExpress. And those aren’t going to be a thing for much longer with tariffs, so I’m looking elsewhere.

      • just_another_person@lemmy.world

        Wow, so you want to use inefficient models super cheap. I guarantee nobody has ever thought of this before. Good move coming to Lemmy for tips on how to do so. I bet you’re the next Sam Altman 🤣

        • marauding_gibberish142@lemmy.dbzer0.comOP

          I don’t understand your point, but I was going to use 4 GPUs (something like used 3090s once they get cheaper, or Arc B580s) to run smaller models like Mistral Small.

  • False@lemmy.world

    You’re entering the realm of enterprise AI horizontal scaling, which is $$$$.

      • just_another_person@lemmy.world

        I assume you’re talking about a CUDA implementation here. There are ways to do this with that system, and even sub-projects that expand on it. I’m mostly pointing out how pointless it is for you to do this. What a waste of time and money.
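
        If you must burn the time anyway: the “ways” I’m referring to mostly boil down to wrappers around torch.distributed with the NCCL backend. A bare-bones multi-node sketch, assuming PyTorch is installed and MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE are set per node:

            # Run one copy of this per node; rendezvous details come
            # from the environment variables above.
            import torch
            import torch.distributed as dist

            def main():
                # NCCL handles the GPU-to-GPU transport over the network.
                dist.init_process_group(backend="nccl", init_method="env://")
                rank = dist.get_rank()
                t = torch.ones(1, device="cuda:0") * rank
                dist.all_reduce(t)  # default op is SUM across all nodes
                print(f"rank {rank}: all_reduce -> {t.item()}")
                dist.destroy_process_group()

            if __name__ == "__main__":
                main()

        Every all_reduce like that crosses your 10GbE link, which is exactly why this doesn’t scale cheap.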

        Edit: others are also pointing this out, but I’m still being downvoted. Mkay.

        • marauding_gibberish142@lemmy.dbzer0.comOP

          Used 3090s go for $800. I was planning to wait for the Arc B580s to come down in price and buy a few. The reason for the networked setup is that I couldn’t find enough PCIe lanes in any of the used computers I was looking at. If there’s either an affordable card with good performance and 48GB of VRAM, or an affordable motherboard + CPU combo under $200 with a lot of PCIe lanes, I’ll gladly drop the idea of distributed AI. I just need lots of VRAM, and this is the only way I could think of to get it.
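
          My napkin math for “lots of VRAM” (weights only, plus a guessed fixed allowance for KV cache and buffers; the overhead number is an assumption):

              def vram_gb(params_billion: float, bits: int,
                          overhead_gb: float = 2.0) -> float:
                  """Weights at a given quantization plus a guessed
                  allowance for KV cache and runtime buffers."""
                  return params_billion * bits / 8 + overhead_gb

              for name, params in [("Mistral Small (24B)", 24.0),
                                   ("a 70B-class model", 70.0)]:
                  for bits in (4, 8):
                      print(f"{name} @ {bits}-bit: "
                            f"~{vram_gb(params, bits):.0f} GB")

          Even at 4-bit, a 70B-class model blows past any single consumer card, hence the 48GB target.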

          Thanks