{"id":630,"date":"2026-05-04T00:32:02","date_gmt":"2026-05-03T21:32:02","guid":{"rendered":"https:\/\/m4.ist\/index.php\/2026\/05\/04\/yerel-yapay-zeka-sunucusu-threadripper-3960x-asus-prime\/"},"modified":"2026-05-04T00:32:02","modified_gmt":"2026-05-03T21:32:02","slug":"yerel-yapay-zeka-sunucusu-threadripper-3960x-asus-prime","status":"publish","type":"post","link":"https:\/\/m4.ist\/index.php\/2026\/05\/04\/yerel-yapay-zeka-sunucusu-threadripper-3960x-asus-prime\/","title":{"rendered":"Threadripper 3960X ASUS Prime: 2026 Guide"},"content":{"rendered":"<h1>Threadripper 3960X and ASUS Prime TRX40-Pro: A Practical Operations Guide to a 3x RTX 3090 Local AI Server<\/h1>\n<div class=\"rankmath-manual-toc\" data-rankmath-toc=\"1\">\n<p><strong>Contents<\/strong><\/p>\n<ul>\n<li><a href=\"#threadripper-3960x-asus-prime-trx40-platformu-ve-3x\">Threadripper 3960X and ASUS Prime: Technical Realities of the TRX40 Platform and the 3x 3090 Configuration<\/a><\/li>\n<li><a href=\"#donanim-mimarisi-ve-gerceklik-payi-tablosu\">Hardware Architecture and Reality-Check Table<\/a><\/li>\n<li><a href=\"#elektrik-tuketimi-ve-termal-yonetim-stratejileri\">Power Consumption and Thermal Management Strategies<\/a><\/li>\n<li><a href=\"#pcie-lane-dagilimi-ve-bellek-yonetimi-stratejileri\">PCIe Lane Allocation and Memory Management Strategies<\/a><\/li>\n<li><a href=\"#vllm-ile-yuksek-baslangicli-cikti-ttft-optimizasyonu\">Optimizing Time to First Token (TTFT) with vLLM<\/a><\/li>\n<li><a href=\"#llamacpp-ile-hafif-ve-esnek-cerceve\">llama.cpp: A Lightweight and Flexible Framework<\/a><\/li>\n<li><a href=\"#ollama-ile-kullanici-dostu-yonetim-ve-entegrasyon\">Ollama: User-Friendly Management and Integration<\/a><\/li>\n<li><a href=\"#karsilastirma-vllm-llamacpp-ve-ollama-arasindaki-farklar\">Comparison: Differences Between vLLM, llama.cpp, and Ollama<\/a><\/li>\n<li><a href=\"#sisteminiz-icin-elektrik-ve-isi-maliyet-analizi\">Power and Heat Cost Analysis for Your System<\/a><\/li>\n<li><a href=\"#kurulum-oncesi-kontrol-listesi\">Pre-Installation Checklist<\/a><\/li>\n<li><a href=\"#sistem-kararliligi-ve-bakim-kontrol-listesi\">System Stability and Maintenance Checklist<\/a><\/li>\n<li><a href=\"#sorun-giderme-ve-yaygin-hatalar\">Troubleshooting and Common Errors<\/a><\/li>\n<li><a href=\"#sikca-sorulan-sorular-sss\">Frequently Asked Questions (FAQ)<\/a><\/li>\n<li><a href=\"#1-3x-3090-sistemi-trx40-uzerinde-ne-kadar\">1. How hot does a 3x 3090 system run on TRX40?<\/a><\/li>\n<li><a href=\"#2-vllm-ve-llamacpp-arasinda-bu-donanimda-fark\">2. What is the difference between vLLM and llama.cpp on this hardware?<\/a><\/li>\n<li><a href=\"#3-elektrik-faturasi-ne-kadar-artar\">3. How much does the electricity bill go up?<\/a><\/li>\n<li><a href=\"#4-pcie-x4x8-darbogazi-performansimi-ne-kadar-etkiler\">4. How much does the PCIe x4\/x8 bottleneck affect my performance?<\/a><\/li>\n<li><a href=\"#5-bu-sistemle-70b-parametreli-modeli-calistirabilir-miyim\">5. 
Can I run a 70B-parameter model on this system?<\/a><\/li>\n<li><a href=\"#sonuc-ve-ileri-adim-onerileri\">Conclusion and Next Steps<\/a><\/li>\n<li><a href=\"#zorlu-senaryo-24-saatlik-surekli-yuk-ve-model\">Stress Scenario: 24 Hours of Continuous Load and Model Swapping<\/a><\/li>\n<li><a href=\"#uygulama-oncesi-kontrol-listesi-trx40-ve-3x-3090\">Pre-Deployment Checklist: TRX40 and 3x 3090 Integration<\/a><\/li>\n<li><a href=\"#sorun-giderme-ve-yaygin-hata-analizi\">Troubleshooting and Common Error Analysis<\/a><\/li>\n<li><a href=\"#karsilastirma-tablosu-model-yukleme-ve-yonetim-stratejileri\">Comparison Table: Model Loading and Management Strategies<\/a><\/li>\n<li><a href=\"#pratik-senaryo-model-degisimi-ve-bellek-yonetimi\">Practical Scenario: Model Switching and Memory Management<\/a><\/li>\n<li><a href=\"#sikca-sorulan-sorular-sss-2\">Frequently Asked Questions (FAQ)<\/a><\/li>\n<\/ul>\n<\/div>\n<p>This guide centers on the Threadripper 3960X and the ASUS Prime TRX40-Pro. I am writing it because most sources present running a Threadripper 3960X with three RTX 3090s as a theoretically possible &#8220;perfect system&#8221;. As someone who pays his own bills and has run homelab systems for years, I owe you the truth: in a home environment this combination shows its physical limits quickly and will test you on heat and power draw. 
Still, with the right configuration and management, its 72GB of combined VRAM and high memory bandwidth keep it among the most capable home setups for running local AI models where 24GB of VRAM falls short.<\/p>\n<p>The goal here is not a simple answer to &#8220;how do I plug it in&#8221;. This article is an in-depth analysis of PCIe lane constraints, thermal choke points, and how software manages the hardware. We will cover the real bottlenecks you meet when running tools such as vLLM, llama.cpp, and Ollama on this hardware, along with cost analyses and the inevitable troubleshooting scenarios, without speculation. If you accept that your bill will rise and are willing to take cooling seriously, this guide is the right starting point.<\/p>\n<h2 id=\"threadripper-3960x-asus-prime-trx40-platformu-ve-3x\">Threadripper 3960X and ASUS Prime: Technical Realities of the TRX40 Platform and the 3x 3090 Configuration<\/h2>\n<p>The ASUS Prime TRX40-Pro motherboard and the AMD Ryzen Threadripper 3960X have long been a much-discussed combination in the local AI world. Whether this pair can drive three RTX 3090s together, however, is not as simple as counting the slots on the board. 
What matters is the number of PCIe lanes the processor provides and how they are allocated. The Threadripper 3960X provides 64 PCIe 4.0 lanes (128 lanes are a feature of the Threadripper PRO and EPYC platforms), and the board shares them among the expansion slots, the M.2 sockets, and the chipset link. A physical x16 slot therefore does not guarantee that all three 3090s run at full PCIe x16 speed.<\/p>\n<p>The first reality is the fit between the physical layout of your hardware and your software&#8217;s memory management. Each RTX 3090 has 24GB of VRAM, for 72GB in total. That is enough to run 70-billion-parameter models (for example Llama-2-70B) with 4-bit quantization (about 35GB of weights) and, only barely, with 8-bit quantization (about 70GB of weights, which leaves almost nothing for the KV cache). However, depending on the PCIe topology, the second and third slots may run at x8 or even x4, which limits data transfer speed. For example, an engine such as vLLM, which synchronizes data between GPUs continuously when a model is split across them, will notice this bandwidth bottleneck. The bottleneck goes unnoticed when the model is fully loaded and little data crosses the bus, but it can degrade performance in workloads that need continuous data flow and model splitting.<\/p>\n<p>Another critical aspect of this configuration is the physical heat distribution on the motherboard. 
You have to seat three 3090s in PCIe slots that usually sit very close together, so the cards compete for cool air. The first card may stay moderate, but the second and third take in the exhaust of their neighbor and usually run hotter. This raises the risk of thermal throttling: you cannot use 100% of your compute, because the cards lower their clocks to protect themselves. At this point the motherboard&#8217;s cooling design and the case airflow become the weakest link in the hardware. Filling the slots is not enough; you have to plan how the system will breathe.<\/p>\n<h3 id=\"donanim-mimarisi-ve-gerceklik-payi-tablosu\">Hardware Architecture and Reality-Check Table<\/h3>\n<p>The table below lays out the lane allocation you may encounter on the TRX40 platform with three RTX 3090s and its effect on performance. It is less a spec sheet than a map of the bottlenecks in your system.<\/p>\n<table>\n<thead>\n<tr>\n<th>Configuration<\/th>\n<th>Slot Position<\/th>\n<th>Link Speed (Realistic)<\/th>\n<th>Role<\/th>\n<th>Key Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>GPU 1<\/strong><\/td>\n<td>PCIe x16 (slot 1)<\/td>\n<td>PCIe x16<\/td>\n<td>Primary model loading<\/td>\n<td>Lowest latency, highest bandwidth. Use it first.<\/td>\n<\/tr>\n<tr>\n<td><strong>GPU 2<\/strong><\/td>\n<td>PCIe x16 (slot 2)<\/td>\n<td>PCIe x8<\/td>\n<td>Model splitting<\/td>\n<td>Link bandwidth is halved. Expect a slowdown on the order of 15-20% in transfer-heavy workloads.<\/td>\n<\/tr>\n<tr>\n<td><strong>GPU 3<\/strong><\/td>\n<td>PCIe x16 (slot 3)<\/td>\n<td>PCIe x4 or x8<\/td>\n<td>Extra memory<\/td>\n<td>Highest risk. At x4, large models hit a serious bottleneck.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This table shows how efficiently your system runs in practice rather than its theoretical capacity. If you use the third GPU only for extra memory, the speed difference may not matter. But if the model has to use that card actively, the time spent moving data to it from the other cards can lengthen total processing time. The split strategy should therefore be chosen around these physical constraints. Remember, this is not a lab exercise; it is a system you will run at home, with your own heat and your own electricity bill.<\/p>\n<h2 id=\"elektrik-tuketimi-ve-termal-yonetim-stratejileri\">Power Consumption and Thermal Management Strategies<\/h2>\n<p>The biggest challenge of this configuration is heat and the cost of electricity. Moving from an ordinary office PC to a server at home, the jump in your bill will come as a shock. Each RTX 3090 can draw roughly 350W-400W at full load (stress testing or running a model). With three of them in the system, the GPUs alone account for 1050W-1200W. 
The 24-core Threadripper 3960X draws 100W-150W at idle and can reach 300W-350W at full load. The motherboard, RAM (256GB DDR4), SSDs, and fans add roughly another 150W.<\/p>\n<p>In that case the system draws roughly 1.5 to 1.7 kW at full load (GPUs at 100% utilization). Run it at that load for a full day (24 hours) and you consume 36 to 41 kWh. At local electricity prices (for example an average of 1.5 TL\/kWh in T\u00fcrkiye recently, varying between industrial and residential tariffs), that is 54 to 61 TL for a single day. On a monthly basis this comes to roughly 1,600 to 1,850 TL. That is only this server&#8217;s share of the bill. Note that these totals already assume the CPU at full load; during GPU-only inference with a mostly idle CPU, the draw is about 200W lower.<\/p>\n<p>Heat management is a direct consequence of that electricity bill. At home, this system is a 1.5 kW heat source. In summer, when your room reaches 30\u00b0C, the fans have to spin faster to expel the heat, which adds further consumption. Heat also shortens hardware life. If the GPU core climbs past its temperature target in the mid-80s \u00b0C, or the GDDR6X memory junction approaches roughly 110\u00b0C, NVIDIA cards throttle to protect themselves, and in the worst case the system crashes. 
So the question to focus on is not only &#8220;is the power supply big enough?&#8221; but also &#8220;how is the airflow?&#8221;<\/p>\n<p><strong>Critical Strategies for Thermal Management:<\/strong><\/p>\n<ol>\n<li><strong>Forced fan control:<\/strong> The factory fan profile is often inadequate for a home setup. You may need to pin the fans near 100% through the BIOS or GPU software (MSI Afterburner and similar). The system gets loud, but it is essential to keeping the hardware alive.<\/li>\n<li><strong>Thermal paste renewal:<\/strong> The stock paste on RTX 3090s is prone to drying out over time. With three cards in the system, repasting all of them regularly (about once a year) lowers thermal resistance.<\/li>\n<li><strong>Airflow optimization:<\/strong> Position the case intake and exhaust fans correctly. A deliberate airflow scheme (positive or negative pressure) is vital so that hot air does not get trapped inside.<\/li>\n<li><strong>Thermal separation:<\/strong> When the GPUs sit very close together, they heat each other. Leave an air gap between them if you can, or use dedicated thermal insulators.<\/li>\n<\/ol>\n<p>It is impossible to give an exact figure for electricity cost because prices differ by region. The formula, however, is simple: <code>(Total Power in kW) x (Operating Hours per Month) x (Local Price per kWh) = Monthly Cost<\/code>. 
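<\/p>\n<p>The formula can be sketched in a few lines of Python. This is a minimal illustration, not part of any tool discussed here: the function names <code>system_power_kw<\/code> and <code>monthly_cost<\/code> are hypothetical, and the wattages and the 1.5 TL\/kWh price are the example figures used in this article, so substitute your own measurements and tariff.<\/p>\n<pre><code class=\"language-python\"># Rough power and cost estimate for the 3x RTX 3090 server.\n# All wattages are the example figures from this article, not measurements.\n\ndef system_power_kw(gpu_watts, gpu_count, cpu_watts, other_watts):\n    '''Return the total system draw in kilowatts.'''\n    return (gpu_watts * gpu_count + cpu_watts + other_watts) \/ 1000.0\n\ndef monthly_cost(power_kw, hours_per_day, price_per_kwh, days=30):\n    '''Return the cost of running at power_kw for the given hours and days.'''\n    return power_kw * hours_per_day * days * price_per_kwh\n\nif __name__ == '__main__':\n    # Upper-bound case: 400 W per GPU, 350 W CPU, 150 W for everything else.\n    power = system_power_kw(gpu_watts=400, gpu_count=3, cpu_watts=350, other_watts=150)\n    cost = monthly_cost(power, hours_per_day=24, price_per_kwh=1.5)\n    print(f'{power:.2f} kW, {cost:.0f} TL per month')\n<\/code><\/pre>\n<p>With these upper-bound figures the estimate is 1.7 kW and about 1,836 TL for a 30-day month of continuous full load; lowering <code>hours_per_day<\/code> shows how much batch-style use saves.<\/p>\n<p>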
Apply this formula to your own bill and you can see your system&#8217;s break-even point. Unless your home is on an industrial-type supply, this level of consumption may push you into a higher tariff tier. For that reason, managing the workload in batches (batch processing) instead of running the system at full load 24\/7 is the more sensible strategy in terms of cost.<\/p>\n<h2 id=\"pcie-lane-dagilimi-ve-bellek-yonetimi-stratejileri\">PCIe Lane Allocation and Memory Management Strategies<\/h2>\n<p>The most important technical constraint in a 3x RTX 3090 configuration is PCIe lane allocation. On TRX40 the Threadripper 3960X provides 64 PCIe 4.0 lanes, shared among the expansion slots, the M.2 sockets, and the chipset link, so not every physical x16 slot is necessarily wired for a full x16 link. This guide assumes the first slot runs at x16, the second at x8, and the third at x4 or x8; confirm the actual wiring against the ASUS Prime TRX40-Pro manual and the link width negotiated on your own system. Under that assumption, the third GPU faces a significant transfer bottleneck compared with the others.<\/p>\n<p>This bottleneck is especially critical in software that splits a model across GPUs (tensor splitting), such as vLLM or llama.cpp. If one part of the model sits on the first GPU, another on the second, and a third on the third GPU, data has to move between those GPUs continuously for every step. 
If the third GPU runs at x4, its transfers are four times slower than at x16. This directly affects generation speed (tokens\/second). As an illustration, a model that would otherwise produce 100 tokens\/second might drop to 80 tokens\/second because of the slower transfers to the third GPU. The system runs at the pace of its slowest link (the bottleneck), not at its theoretical peak.<\/p>\n<p>The memory management strategy plays a critical role in easing this bottleneck. Three 3090s provide 72GB of VRAM in total, but to use all of it efficiently you have to decide carefully which GPU each part of a model is loaded onto. Large models (e.g. 70B parameters) do not fit on a single GPU, so the model has to be split across cards (tensor parallelism or a layer-wise split). Engines such as vLLM can do this automatically, but because of the lane allocation on TRX40 you may need to intervene manually where the engine allows it. Loading the largest share of the model onto the first GPU and distributing the rest to the others can improve overall efficiency. Given the third GPU&#8217;s speed limit, it is sensible not to put too much load on it.<\/p>\n<p>Memory bandwidth also deserves attention. Each RTX 3090 has 936 GB\/s of memory bandwidth. Even with three cards, that bandwidth is only available locally, within each card. 
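<\/p>\n<p>To put the two kinds of bandwidth side by side, the sketch below compares the approximate bandwidth of a PCIe 4.0 link at different widths with the 936 GB\/s of local memory bandwidth. It assumes about 1.97 GB\/s per lane per direction (16 GT\/s with 128b\/130b encoding, before protocol overhead), and the function name <code>pcie4_bandwidth_gbs<\/code> is hypothetical.<\/p>\n<pre><code class=\"language-python\"># Approximate one-direction bandwidth of a PCIe 4.0 link versus local VRAM.\n# Assumption: 16 GT\/s per lane with 128b\/130b encoding, before protocol overhead.\n\nVRAM_GBS = 936.0  # RTX 3090 memory bandwidth quoted in this article\n\ndef pcie4_bandwidth_gbs(lanes):\n    '''Return the approximate per-direction bandwidth in GB\/s for a lane count.'''\n    per_lane = 16.0 * (128.0 \/ 130.0) \/ 8.0  # about 1.97 GB\/s per lane\n    return per_lane * lanes\n\nif __name__ == '__main__':\n    for lanes in (16, 8, 4):\n        bw = pcie4_bandwidth_gbs(lanes)\n        print(f'x{lanes}: {bw:.1f} GB\/s, VRAM is {VRAM_GBS \/ bw:.0f}x faster')\n<\/code><\/pre>\n<p>Under these assumptions even a full x16 link (about 31.5 GB\/s) is roughly 30 times slower than the card&#8217;s own memory, and an x4 link is four times slower again.<\/p>\n<p>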
Transfers over PCIe fall far behind this local bandwidth. So when data moves between system RAM and the GPUs (while a model is loading) or is shared between GPUs (during inference), PCIe bandwidth becomes one of the most critical factors in system speed. If the system has 256GB of DDR4 RAM, the bandwidth of that RAM has to be considered too. DDR4 offers less bandwidth than DDR5, which can limit how fast the CPU pulls data from RAM when large models are loaded or partly offloaded.<\/p>\n<p><strong>Recommended Approaches to Memory Management:<\/strong><\/p>\n<ol>\n<li><strong>Model splitting (tensor split):<\/strong> Rather than distributing the model as evenly as possible, put less load on the slow cards (the x4 slot).<\/li>\n<li><strong>CPU offload:<\/strong> If the model does not fit entirely on the GPUs, place some layers in system RAM (on the CPU). This lowers speed but prevents out-of-memory crashes.<\/li>\n<li><strong>PCIe bandwidth check:<\/strong> Check the speed at which each PCIe slot is actually running with the <code>lspci<\/code> command (the <code>LnkSta<\/code> field in the output of <code>lspci -vv<\/code> shows the negotiated speed and width). 
If a slot is not running at the speed you expect, review the BIOS settings.<\/li>\n<\/ol>\n<p>These strategies accept the physical limits of the hardware and aim to make the software run as efficiently as possible within them. There is no &#8220;perfect system&#8221;, only management that gets the best result out of the hardware you have.<\/p>\n<h2 id=\"vllm-ile-yuksek-baslangicli-cikti-ttft-optimizasyonu\">Optimizing Time to First Token (TTFT) with vLLM<\/h2>\n<p>vLLM stands out as an engine designed to deliver a low time to first token (TTFT) and high throughput (tokens\/second) on local AI servers. On multi-GPU configurations such as 3x RTX 3090, however, tuning vLLM takes more than issuing a &#8220;run&#8221; command. When vLLM distributes a model across GPUs, you have to account for PCIe lane bottlenecks and memory management. On the TRX40 platform in particular, a third GPU running at x4 can undermine vLLM&#8217;s automatic distribution.<\/p>\n<p>vLLM&#8217;s core advantage is that its PagedAttention mechanism optimizes VRAM use for the KV cache. Fitting a 70B-parameter model (e.g. Llama-2-70B) on a single 3090 is impossible. In that case vLLM shards the model across the GPUs. 
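<\/p>\n<p>A back-of-the-envelope estimate shows why a single card is not enough. The sketch below counts weight memory only (parameters times bits per weight) and ignores the KV cache, activations, and framework overhead, so real requirements are higher; the function name <code>weight_gb<\/code> is hypothetical.<\/p>\n<pre><code class=\"language-python\"># Weight-only memory estimate for a model at different precisions.\n# Ignores KV cache, activations, and runtime overhead, so treat it as a floor.\n\nGPU_VRAM_GB = 24  # one RTX 3090\n\ndef weight_gb(params_billion, bits_per_weight):\n    '''Return the size of the weights in GB (10**9 bytes).'''\n    return params_billion * bits_per_weight \/ 8.0\n\nif __name__ == '__main__':\n    for label, bits in (('16-bit', 16), ('8-bit', 8), ('4-bit', 4)):\n        size = weight_gb(70, bits)\n        gpus_needed = -(-size \/\/ GPU_VRAM_GB)  # ceiling division\n        print(f'{label}: {size:.0f} GB of weights, at least {gpus_needed:.0f} GPUs')\n<\/code><\/pre>\n<p>For a 70B model the 16-bit weights alone (140 GB) exceed the 72GB total, 8-bit weights (70 GB) leave almost no room for the KV cache, and 4-bit weights (35 GB) need at least two cards.<\/p>\n<p>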
However, because of the lane allocation on TRX40, transfers between GPUs can slow down. That increases the time it takes to produce the first token (TTFT): the user enters a prompt and then has to wait for the answer. This delay is especially irritating in interactive (chat-style) use.<\/p>\n<p><strong>Strategies for vLLM Optimization:<\/strong><\/p>\n<ul>\n<li><strong>Model splitting strategy:<\/strong> Under tensor parallelism vLLM shards the model evenly across the participating GPUs, so it cannot by itself give the x4 card a smaller share. Ideally the first GPU (x16) and the second GPU (x8) carry more of the load and the third (x4) less; in practice that means choosing which GPUs take part (for example with <code>CUDA_VISIBLE_DEVICES<\/code>) rather than tuning ratios.<\/li>\n<li><strong>Tensor parallelism:<\/strong> The <code>tensor-parallel-size<\/code> parameter sets how many GPUs vLLM shards the model across. The value has to divide the model&#8217;s attention head count evenly; Llama-2-70B has 64 heads, so a value of 3 is rejected while 2 works. With a value of 2 the third GPU is not used by that vLLM instance, which also keeps the slow x4 link out of the critical path and leaves the card free for a second, smaller model.<\/li>\n<li><strong>KV cache block management:<\/strong> vLLM allocates KV cache memory in fixed-size blocks, and the memory left for those blocks determines how many requests can be processed in parallel. On systems with large GPUs, memory use can be tuned by adjusting the block size.<\/li>\n<\/ul>\n<p>A basic example command line for running vLLM follows. 
This command determines how the model is distributed across the GPUs. Note the critical role of the <code>tensor-parallel-size<\/code> parameter.<\/p>\n<pre><code class=\"language-bash\"># Launching vLLM (two of the three GPUs)\n# --model: path or name of the model\n# --quantization: weight quantization scheme of the checkpoint\n# --tensor-parallel-size: number of GPUs to shard across; it must divide the\n#   model's 64 attention heads, so 2 is valid here and 3 is not\n# --max-model-len: maximum context length (Llama 2 was trained with 4096)\n# --dtype: activation precision (float16 or bfloat16)\n\n# The model path is a placeholder: point it at a 4-bit (e.g. AWQ) 70B checkpoint.\n# The unquantized meta-llama\/Llama-2-70b-hf weights need about 140GB in 16-bit\n# and do not fit in 72GB of VRAM.\npython -m vllm.entrypoints.api_server \\\n    --model \/path\/to\/llama-2-70b-awq \\\n    --quantization awq \\\n    --tensor-parallel-size 2 \\\n    --max-model-len 4096 \\\n    --dtype float16 \\\n    --host 0.0.0.0 \\\n    --port 8000\n<\/code><\/pre>\n<p>Watch vLLM&#8217;s logs carefully when you run this command. With <code>--tensor-parallel-size<\/code> set to 2, only two GPUs take part, so make sure they are the cards in the fastest slots (for example by setting <code>CUDA_VISIBLE_DEVICES<\/code>). This keeps the third GPU&#8217;s PCIe bottleneck out of the path and makes the most of your fastest GPUs.<\/p>\n<h2 id=\"llamacpp-ile-hafif-ve-esnek-cerceve\">llama.cpp: A Lightweight and Flexible Framework<\/h2>\n<p>llama.cpp is one of the most flexible solutions in the local AI world for mixed CPU and GPU use (offload). On TRX40 with three RTX 3090s, its biggest advantage is that you can distribute the model&#8217;s layers across the GPUs and system RAM as you see fit. It offers less automation than vLLM but more control. 
Where memory is tight, llama.cpp&#8217;s CPU offload keeps the run from failing when the model does not fit entirely on the GPUs.<\/p>\n<p>llama.cpp loads as many of the model&#8217;s layers as you specify onto the GPUs and keeps the remaining layers in system RAM (on the CPU). This extends the usable memory capacity but lowers processing speed. On this configuration, the 72GB of combined VRAM across three 3090s is well suited to running 70B-parameter models with 4-bit quantization, and 8-bit is possible only with very little headroom. llama.cpp also lets you decide how many layers sit on the GPUs and how they are shared between cards. That is a strategic advantage for minimizing the PCIe lane bottleneck: you can give the fast GPUs (slots 1 and 2) the larger share and leave a smaller share to the slow GPU (slot 3) or to RAM.<\/p>\n<p><strong>Strategies for llama.cpp Configuration:<\/strong><\/p>\n<ul>\n<li><strong>GPU offload setting:<\/strong> The <code>n_gpu_layers<\/code> parameter sets the number of layers loaded onto the GPUs. Set it according to your VRAM capacity and PCIe lane allocation.<\/li>\n<li><strong>Model split strategy:<\/strong> When llama.cpp distributes layers across the GPUs, you may need to adjust the per-GPU shares to compensate for the slower third GPU.<\/li>\n<li><strong>Memory management:<\/strong> llama.cpp lets you tune parameters such as <code>n_ctx<\/code> (context length) and <code>n_batch<\/code> precisely. This affects performance, especially with long contexts.<\/li>\n<\/ul>\n<p>You can use Python code to control which GPUs the model is loaded onto with llama.cpp. The code below shows one strategy for distributing the model&#8217;s layers across GPUs.<\/p>\n<pre><code class=\"language-python\"># Loading a model with the llama.cpp Python API (llama-cpp-python)\n# n_gpu_layers: number of layers to offload to the GPUs\n# n_ctx: context length\n# n_batch: prompt-processing batch size\n\nimport llama_cpp\n\n# Load the model\nllm = llama_cpp.Llama(\n    model_path=\".\/llama-2-70b.gguf\",\n    n_ctx=4096,           # Context length (Llama 2 was trained with 4096 tokens)\n    n_batch=1024,         # Prompt-processing batch size\n    n_gpu_layers=99,      # Layers to offload (a value above the layer count offloads all)\n    offload_kqv=True,     # Keep the KV cache on the GPUs\n    tensor_split=[0.5, 0.5, 0.0] # Per-GPU share; 0.0 leaves GPU 3 out entirely\n)\n\n# Run a query\noutput = llm(\n    \"Write your question here...\",\n    max_tokens=256,\n    stop=[\"USER:\", \"ASSISTANT:\"],\n    echo=True\n)\n\nprint(output['choices'][0]['text'])\n<\/code><\/pre>\n<p>In this code the third entry of <code>tensor_split<\/code> is set to 0.0 to sidestep the third GPU&#8217;s bottleneck. The model&#8217;s layers are loaded onto the first two GPUs only, so the third GPU holds none of the model and its 24GB stays free; a 4-bit 70B GGUF file (roughly 40GB) fits in the 48GB of the first two cards. 
On a TRX40 configuration with a slow third slot, this is often the most efficient way to use llama.cpp.<\/p>\n<h2 id=\"ollama-ile-kullanici-dostu-yonetim-ve-entegrasyon\">Ollama: User-Friendly Management and Integration<\/h2>\n<p>Ollama is one of the easiest tools to install and use for running local AI models. On TRX40 with three RTX 3090s, Ollama detects the GPUs automatically and distributes the model&#8217;s layers across them. It does not offer the fine-grained tuning of vLLM or llama.cpp, though. Its core advantage is that you can run models without dealing with the technical details.<\/p>\n<p>Ollama loads models onto the GPUs automatically and handles memory management internally. On a TRX40 configuration its automatic placement may not be tuned for the PCIe lane bottleneck, but it gets the system running quickly without any manual configuration. 
This makes it a good fit for beginners and for users who do not need to experiment constantly.<\/p>\n<p><strong>Strategies for Using Ollama:<\/strong><\/p>\n<ul>\n<li><strong>Automatic GPU Detection:<\/strong> Ollama automatically detects the GPUs in the system and distributes the model&#8217;s layers across them.<\/li>\n<li><strong>Model Loading:<\/strong> Ollama downloads models from its own model library (registry) and loads them.<\/li>\n<li><strong>API Integration:<\/strong> Ollama exposes a local HTTP API, so other applications can integrate with it easily.<\/li>\n<\/ul>\n<p>The commands for running Ollama are as follows:<\/p>\n<pre><code class=\"language-bash\"># Start the Ollama server (runs in the foreground; skip this if Ollama already runs as a service)\nollama serve\n\n# In a second terminal: pull and run a model\nollama pull llama2:70b\nollama run llama2:70b\n<\/code><\/pre>\n<p>These commands have Ollama load the model onto the GPUs automatically and run it. On the TRX40 configuration the automatic distribution may not account for the PCIe lane bottleneck; what you gain is a system that runs quickly without any manual tuning.<\/p>\n<h2 id=\"karsilastirma-vllm-llamacpp-ve-ollama-arasindaki-farklar\">Comparison: Differences Between vLLM, llama.cpp, and Ollama<\/h2>\n<p>These three tools serve different use cases on the TRX40 and 3x RTX 3090 configuration. vLLM is ideal for workloads that require high performance and low latency. 
llama.cpp is preferred for its flexibility and memory control, while Ollama stands out for ease of use and fast setup.<\/p>\n<table>\n<thead>\n<tr>\n<th>Feature<\/th>\n<th>vLLM<\/th>\n<th>llama.cpp<\/th>\n<th>Ollama<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Performance<\/strong><\/td>\n<td>High (optimized)<\/td>\n<td>Medium-High (depends on settings)<\/td>\n<td>Medium<\/td>\n<\/tr>\n<tr>\n<td><strong>Memory Management<\/strong><\/td>\n<td>Automatic (PagedAttention)<\/td>\n<td>Manual (offload settings)<\/td>\n<td>Automatic<\/td>\n<\/tr>\n<tr>\n<td><strong>Ease of Setup<\/strong><\/td>\n<td>Medium<\/td>\n<td>Hard (manual tuning)<\/td>\n<td>Very Easy<\/td>\n<\/tr>\n<tr>\n<td><strong>PCIe Bottleneck Handling<\/strong><\/td>\n<td>High impact (needs manual tuning)<\/td>\n<td>Medium impact (tunable split)<\/td>\n<td>Automatic (little control)<\/td>\n<\/tr>\n<tr>\n<td><strong>Use Case<\/strong><\/td>\n<td>High-traffic API, chatbot<\/td>\n<td>Quantized inference, custom deployment<\/td>\n<td>Quick trials, prototyping<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This table shows which tool suits which scenario. vLLM is ideal for serving a high-performance API. llama.cpp suits those who want to distribute the model&#8217;s layers manually. Ollama is the best option for those who want fast setup and ease of use.<\/p>\n<h2 id=\"sisteminiz-icin-elektrik-ve-isi-maliyet-analizi\">Electricity and Heat Cost Analysis for Your System<\/h2>\n<p>The biggest running cost of the TRX40 and 3x RTX 3090 configuration is electricity and heat. This cost varies with how long and how hard the system runs. 
You can still estimate the system&#8217;s average consumption and project the cost with a simple formula.<\/p>\n<p><strong>Electricity Consumption Formula:<\/strong><\/p>\n<pre><code>Monthly Cost = (Total Power in kW) x (Daily Operating Hours) x (30 Days) x (Local Unit Price)\n<\/code><\/pre>\n<p>For the TRX40 and 3x RTX 3090 configuration:<\/p>\n<ul>\n<li>Total Power: ~1.5 kW &#8211; 1.7 kW (full load)<\/li>\n<li>Daily Operating Hours: 8-12 hours (average use)<\/li>\n<li>Local Unit Price: 1.5 TL\/kWh (example)<\/li>\n<\/ul>\n<p>Plugging mid-range values into the formula:<\/p>\n<pre><code>Monthly Cost = 1.6 kW x 10 hours x 30 days x 1.5 TL\/kWh = 720 TL\n<\/code><\/pre>\n<p>This figure covers only the system&#8217;s own electricity consumption. Cooling fans and other extras should be added on top. The heat the system produces also raises the temperature of the room, which can increase air-conditioning costs in summer. 
That is why heat management and cooling strategy matter so much for the cost analysis.<\/p>\n<p><strong>Electricity Consumption and Estimated Cost Table<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th>State<\/th>\n<th>GPU Power (3x)<\/th>\n<th>CPU Power<\/th>\n<th>Total Power (kW)<\/th>\n<th>Daily Consumption (kWh, 24 h)<\/th>\n<th>Monthly Cost (TL)*<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Idle<\/strong><\/td>\n<td>150W<\/td>\n<td>50W<\/td>\n<td>0.2kW<\/td>\n<td>4.8 kWh<\/td>\n<td>~216 TL<\/td>\n<\/tr>\n<tr>\n<td><strong>Medium Load<\/strong><\/td>\n<td>600W<\/td>\n<td>150W<\/td>\n<td>0.8kW<\/td>\n<td>19.2 kWh<\/td>\n<td>~864 TL<\/td>\n<\/tr>\n<tr>\n<td><strong>Full Load<\/strong><\/td>\n<td>1200W<\/td>\n<td>300W<\/td>\n<td>1.5kW<\/td>\n<td>36.0 kWh<\/td>\n<td>~1,620 TL<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em>Note: Calculated at 1.5 TL\/kWh for 24 hours of operation per day over 30 days. Actual cost varies with local prices.<\/em><\/p>\n<p>This table shows the system&#8217;s electricity consumption and cost in different states. Running at full load is quite expensive. Running the system only when you need it, or splitting the load into batches, is therefore a sensible way to reduce costs.<\/p>\n<h2 id=\"kurulum-oncesi-kontrol-listesi\">Pre-Installation Checklist<\/h2>\n<p>Completing the checklist below before you build the system is critical for minimizing problems afterwards. 
The list covers both the physical and the software preparation of the hardware.<\/p>\n<p><strong>Hardware Pre-Installation Checklist<\/strong><\/p>\n<ol>\n<li>\u2610 <strong>Power Supply Check:<\/strong> Make sure you have a quality PSU rated for 1600 W or more. Three 3090s plus a Threadripper reach roughly 1.5-1.7 kW at full load, which a 1000-1200 W unit cannot sustain.<\/li>\n<li>\u2610 <strong>Heat Management:<\/strong> Make sure the case airflow is configured correctly. Are the intake and exhaust fans in place?<\/li>\n<li>\u2610 <strong>PCIe Slot Fit:<\/strong> Make sure the PCIe slots on the TRX40 motherboard physically accommodate three 3090s. The slots may sit very close together, so take the risk of thermal interaction into account.<\/li>\n<li>\u2610 <strong>Version Compatibility:<\/strong> Make sure the NVIDIA driver, PyTorch, and CUDA versions are compatible with each other.<\/li>\n<li>\u2610 <strong>Memory Management:<\/strong> Confirm that 256 GB of RAM is enough for your models. RAM shortages can occur while a model is loading.<\/li>\n<li>\u2610 <strong>Coolant:<\/strong> If you plan to use liquid cooling, take the necessary precautions to minimize the risk of leaks.<\/li>\n<li>\u2610 <strong>Power Connections:<\/strong> Make sure enough PCIe power cables are connected to the GPUs. Using a separate cable for each GPU is better.<\/li>\n<li>\u2610 <strong>Thermal Paste:<\/strong> Make sure the thermal paste on the GPUs is fresh. 
Old, dried-out paste transfers heat poorly.<\/li>\n<li>\u2610 <strong>System Test:<\/strong> Before putting the system into service, check the BIOS settings and verify that the PCIe slots work correctly.<\/li>\n<li>\u2610 <strong>Backup:<\/strong> Take a system backup or back up your important files. Data loss can occur during installation.<\/li>\n<\/ol>\n<p>Completing this checklist goes a long way toward minimizing the problems you may face after installation. Pay particular attention to the power supply and heat management.<\/p>\n<h2 id=\"sistem-kararliligi-ve-bakim-kontrol-listesi\">System Stability and Maintenance Checklist<\/h2>\n<p>Once the system is built, it needs regular maintenance for long-term stability and performance. This list is critical for detecting and preventing potential problems early.<\/p>\n<p><strong>System Stability and Maintenance Checklist<\/strong><\/p>\n<ol>\n<li>\u2610 <strong>Daily Temperature Check:<\/strong> Check the GPU and CPU temperatures every day. Treat readings above 85\u00b0C as an alarm.<\/li>\n<li>\u2610 <strong>Fan Check:<\/strong> Make sure the fans run properly and are free of dust. 
Dust buildup raises temperatures.<\/li>\n<li>\u2610 <strong>Software Updates:<\/strong> Make sure the NVIDIA driver, PyTorch, vLLM, and other software are up to date.<\/li>\n<li>\u2610 <strong>Memory Test:<\/strong> Check the RAM for errors. If you use ECC RAM, review the error logs.<\/li>\n<li>\u2610 <strong>Thermal Paste Replacement:<\/strong> Plan to replace the GPUs&#8217; thermal paste once a year.<\/li>\n<li>\u2610 <strong>Power Supply Check:<\/strong> Check that the PSU voltages are stable and that no cables have worked loose.<\/li>\n<li>\u2610 <strong>Airflow Test:<\/strong> Make sure air moves through the system properly.<\/li>\n<li>\u2610 <strong>Software Logs:<\/strong> Review application logs regularly and track down any errors.<\/li>\n<li>\u2610 <strong>Backup:<\/strong> Back up model files and configurations regularly.<\/li>\n<li>\u2610 <strong>System Performance:<\/strong> Benchmark the system regularly. A drop in performance can point to a hardware problem.<\/li>\n<\/ol>\n<p>This list is critical for keeping your system stable over the long term. 
Pay particular attention to heat management and software updates.<\/p>\n<h2 id=\"sorun-giderme-ve-yaygin-hatalar\">Troubleshooting and Common Errors<\/h2>\n<p>The common errors you may encounter on the 3x RTX 3090 and TRX40 configuration, along with their solutions, are:<\/p>\n<ol>\n<li><strong>PCIe Lane Bottleneck:<\/strong> If the third GPU runs at x4, distributing the model across the cards can slow down. Solution: lower the <code>tensor-parallel-size<\/code> parameter in vLLM or set the model distribution manually.<\/li>\n<li><strong>Temperature Rise:<\/strong> If the GPUs exceed 85\u00b0C, they protect themselves by throttling. Solution: raise the fan speed and lower the ambient temperature.<\/li>\n<li><strong>Electricity Consumption:<\/strong> If the bill is too high, run the system only when you need it or split the load into batches.<\/li>\n<li><strong>Software Crash:<\/strong> If the system crashes while a model is loading, it may be running out of RAM. Solution: add RAM or use a smaller model.<\/li>\n<li><strong>PCIe Link Errors:<\/strong> If the PCIe slots report errors, check the BIOS settings.<\/li>\n<li><strong>Thermal Paste Problem:<\/strong> If the GPUs run very hot, the thermal paste may need replacing.<\/li>\n<li><strong>Insufficient Power Supply:<\/strong> If the system does not power on, the PSU may be too weak. 
Solution: use a higher-wattage PSU.<\/li>\n<\/ol>\n<p>Most of these errors stem from the physical limits of the hardware. Accepting those limits and making the software work within them is the best strategy for this system.<\/p>\n<h2 id=\"sikca-sorulan-sorular-sss\">Frequently Asked Questions (FAQ)<\/h2>\n<h3 id=\"1-3x-3090-sistemi-trx40-uzerinde-ne-kadar\">1. How hot does a 3x 3090 system run on TRX40?<\/h3>\n<p>This configuration is a serious heat source in a home environment. Under full load, GPU temperatures can range from 75\u00b0C to 85\u00b0C. If the ambient temperature is above 30\u00b0C, these values can climb further. Raising the fan speed and lowering the ambient temperature are critical for keeping heat under control. Heat shortens hardware life and reduces performance.<\/p>\n<h3 id=\"2-vllm-ve-llamacpp-arasinda-bu-donanimda-fark\">2. What is the difference between vLLM and llama.cpp on this hardware?<\/h3>\n<p>vLLM is designed for high performance and low latency. On the TRX40 configuration, vLLM&#8217;s automatic distribution may not account for the PCIe lane bottleneck. llama.cpp, by contrast, lets you distribute the model&#8217;s layers manually, which is a strategic advantage for minimizing the third GPU&#8217;s bottleneck. In short, vLLM is more automatic and llama.cpp is more flexible.<\/p>\n<h3 id=\"3-elektrik-faturasi-ne-kadar-artar\">3. 
How much will the electricity bill increase?<\/h3>\n<p>When the system runs at full load, the monthly electricity bill can range from 700 TL to 1800 TL at the example rate of 1.5 TL\/kWh. The exact figure depends on how long and how hard the system runs. Cooling costs (air conditioning) may come on top. This cost determines your system&#8217;s break-even point.<\/p>\n<h3 id=\"4-pcie-x4x8-darbogazi-performansimi-ne-kadar-etkiler\">4. How much does the PCIe x4\/x8 bottleneck affect my performance?<\/h3>\n<p>If the third GPU runs at x4, distributing the model across the cards can slow down. This is most noticeable in tools that distribute automatically, such as vLLM. In tools like llama.cpp, manual settings can minimize the bottleneck. The size of the performance drop depends on how the model is split.<\/p>\n<h3 id=\"5-bu-sistemle-70b-parametreli-modeli-calistirabilir-miyim\">5. Can I run a 70B-parameter model on this system?<\/h3>\n<p>Yes. The 72 GB of VRAM across three 3090s is enough to run a 70B-parameter model with 4-bit quantization. At 8-bit the weights alone take about 70 GB, which leaves almost no room for the KV cache, so treat 8-bit as impractical on this system. Because of the PCIe lane bottleneck, distributing the model can be slow, and you may need the manual settings in llama.cpp or vLLM.<\/p>\n<h2 id=\"sonuc-ve-ileri-adim-onerileri\">Conclusion and Next Steps<\/h2>\n<p>The 3x RTX 3090 and TRX40 configuration is a powerful option for local AI. Accepting the limits of this hardware and managing them is the most important step for your system. 
Electricity cost, heat management, and PCIe lane bottlenecks are its most critical points.<\/p>\n<p><strong>Recommended Next Steps:<\/strong><\/p>\n<ol>\n<li><strong>Performance Tests:<\/strong> Test the system under different loads. Compare the performance of vLLM, llama.cpp, and Ollama.<\/li>\n<li><strong>Cost Analysis:<\/strong> Calculate your electricity cost and determine the system&#8217;s break-even point.<\/li>\n<li><strong>Heat Management:<\/strong> Optimize your thermal setup by raising fan speeds and lowering the ambient temperature.<\/li>\n<li><strong>Software Updates:<\/strong> Keep versions up to date. New releases can improve performance.<\/li>\n<li><strong>Maintenance:<\/strong> Maintain the system regularly, including thermal paste replacement and cleaning.<\/li>\n<\/ol>\n<p>This system is one of the most powerful options for building a local AI server at home. You do need to be careful about cost and heat management. 
When managing the system, the best strategy is to accept the limits of the hardware and make the software work within them.<\/p>\n<p><strong>Recommended Reading:<\/strong><\/p>\n<ul>\n<li><a href=\"#\">Local LLM Setup Guide<\/a><\/li>\n<li><a href=\"#\">Hardware Cooling Basics<\/a><\/li>\n<li><a href=\"#\">GPU Memory Management Guide<\/a><\/li>\n<li><a href=\"#\">Electricity Cost Calculator<\/a><\/li>\n<\/ul>\n<p><strong>Sources:<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/developer.nvidia.com\/cuda-toolkit\" rel=\"noopener noreferrer\" target=\"_blank\">NVIDIA CUDA Documentation<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/ggerganov\/llama.cpp\" rel=\"noopener noreferrer\" target=\"_blank\">llama.cpp Official GitHub Repository<\/a><\/li>\n<li><a href=\"https:\/\/vllm.readthedocs.io\/\" rel=\"noopener noreferrer\" target=\"_blank\">vLLM Official Documentation<\/a><\/li>\n<li><a href=\"https:\/\/www.asus.com\/tr\/support\/\" rel=\"noopener noreferrer\" target=\"_blank\">ASUS TRX40-Pro User Manual<\/a><\/li>\n<\/ul>\n<p>This guide is meant to help you while building and managing your system. We wish you success.<\/p>\n<h3 id=\"zorlu-senaryo-24-saatlik-surekli-yuk-ve-model\">Stress Scenario: 24-Hour Continuous Load and Model Swapping<\/h3>\n<p>The TRX40 platform with three 3090s is powerful on paper, but in practice your biggest enemy is not the hardware itself; it is the strain of continuous operation. 
A local AI server causes no trouble when it runs office hours, but things change in an environment that processes data 24\/7 and needs the model to stay resident in memory.<\/p>\n<p>A realistic scenario goes like this: in the morning you serve a 200-person team working concurrently with 7B or 13B models through vLLM. The GPUs are at 100% utilization and memory occupancy is around 95%. In the afternoon a different team needs a larger model (for example a quantized 30B or 70B), so the current model has to be evicted from memory and the new one loaded (swapping). In this scenario the 256 GB of RAM is not a rescue; it is a staging area. Between DDR4 bandwidth limits and the x8\/x4 bottlenecks in the PCIe lane allocation, the time it takes to move the model from main memory into VRAM becomes critical.<\/p>\n<p>Thermal protection kicks in within the first 15 minutes. With heavy PCIe traffic and CPU load on the TRX40 board, the stock coolers of three 3090s are not sufficient on their own. The number to watch is not whether GPU temperatures stay under 85\u00b0C but whether <em>hotspot<\/em> readings stay under 105\u00b0C. The VRM (voltage regulator module) on the ASUS Prime TRX40-Pro can tolerate 70\u00b0C under sustained load at this intensity. However, when airflow inside the case breaks down, the exhaust of the third card blows into the intake of the first 
card, creating &#8220;heat recirculation&#8221;.<\/p>\n<p>In this scenario, the first sign of failure is neither a performance drop nor silence but a &#8220;Local Model Response Error&#8221;. When memory runs out, the RAM-to-VRAM transfers congest the road the system depends on, the PCIe link. vLLM&#8217;s low TTFT (Time To First Token) advantage disappears immediately at this bottleneck. As a real-world administrator, what you need to watch is not whether the system &#8220;works&#8221; but &#8220;how it works&#8221;. The 10-second gap you expect for a model switch actually stretches to 45 seconds.<\/p>\n<p>The remedy is straightforward: for always-on scenarios, enable model caching or dedicate the third card to an entirely different workload (for example video rendering or image processing). Even a single 3090 does not run at 100% efficiency under a 24\/7 AI load, which feeds directly into your cost analysis. The electricity bill grows not only with &#8220;operating hours&#8221; but also with &#8220;cooling load&#8221;. Each 3090 alone draws roughly 300 W or more under load, and driving the fans harder to remove that heat from the case adds to it; overall, cooling can raise the system&#8217;s total electricity consumption by 15-20%. 
This is not only a &#8220;hardware&#8221; problem but an &#8220;energy cost&#8221; problem.<\/p>\n<h3 id=\"uygulama-oncesi-kontrol-listesi-trx40-ve-3x-3090\">Pre-Deployment Checklist: TRX40 and 3x 3090 Integration<\/h3>\n<p>Before building this setup, review the physical and software limits of the hardware. The items below are critical steps for minimizing the chance that the project fails.<\/p>\n<p><strong>Hardware and Physical Preparation<\/strong><\/p>\n<ul>\n<li>\u2610 <strong>Case Volume and Fan Layout:<\/strong> Given the thickness (usually 3 slots) and length of three 3090s, is there at least 30 cm of extra space inside the case? Do the side fans direct airflow straight at the third card?<\/li>\n<li>\u2610 <strong>Power Supply (PSU) Adequacy:<\/strong> Is a 1600W+ Platinum\/Titanium-certified PSU installed for three 3090s (about 1,050 W in total at the 350 W reference TDP), the Threadripper 3960X (150-280W), and the motherboard and RAM load? Do not stay at the 1200W mark; the PSU must sustain this load while staying in its 80-90% efficiency range.<\/li>\n<li>\u2610 <strong>PCIe Slot Layout:<\/strong> Does the slot arrangement of the three 3090s on the TRX40 board (x16\/x16\/x4 or x16\/x8\/x8) suit the heat flow inside the case? If possible, keep the third card in the top slot or in the coolest position.<\/li>\n<li>\u2610 <strong>Cooling Solution:<\/strong> Has liquid cooling or very high-RPM air cooling (an AIO or a custom fan setup) been planned for the thermal hotspots of the third card?<\/li>\n<li>\u2610 <strong>Power Cables:<\/strong> Does each card have its own separate PCIe power cables? Do not feed two cards from a single cable; this can lead to melted cables and system crashes.<\/li>\n<\/ul>\n<p><strong>Software and Configuration<\/strong><\/p>\n<ul>\n<li>\u2610 <strong>BIOS Settings:<\/strong> Have the PCIe lane settings (x16, x8, x4) been verified manually in the TRX40 BIOS? Auto settings sometimes fall back to PCIe 3.0 instead of PCIe 4.0.<\/li>\n<li>\u2610 <strong>DRAM Speed and Timings:<\/strong> Have the frequency (2933\/3200 MT\/s) and latency (CL22\/CL28) of the 256 GB of DDR4 RAM been tested for stability? Overclocked memory can cause instability at high memory load.<\/li>\n<li>\u2610 <strong>Thermal and Power Limit:<\/strong> Has the GPU power limit been set to 90% or 80% of TDP in the NVIDIA driver? This prevents thermal spikes and extends hardware life.<\/li>\n<li>\u2610 <strong>Thermal Monitoring Software:<\/strong> Are HWMonitor, GPU-Z, and Fan Control installed so that thermals can be monitored under load?<\/li>\n<li>\u2610 <strong>VRAM and Swap Settings:<\/strong> Has swap space on Linux or Windows been configured according to RAM usage (for example 128GB+ of swap for 32GB+ of RAM)?<\/li>\n<\/ul>\n<h3 id=\"sorun-giderme-ve-yaygin-hata-analizi\">Troubleshooting and Common Error Analysis<\/h3>\n<p>The problems you will face on this system usually come from configuration mistakes or thermal limits rather than hardware failure. Here are the most common situations and how to approach them.<\/p>\n<p><strong>Problem 1: The Third Card Is Missing or Stuck at PCIe x4<\/strong><br \/>\n&#8211; <strong>Symptom:<\/strong> The system sees the third 
RTX 3090, but its performance is below 20% of what you expect.<br \/>\n&#8211; <strong>Root Cause:<\/strong> The TRX40 platform has 64 PCIe lanes. With three GPUs installed, some slots drop to x8 or x4 automatically. If the BIOS setting is &#8220;Auto&#8221;, the third card may drop to x4.<br \/>\n&#8211; <strong>Solution:<\/strong> Check the PCIe lane configuration manually in the BIOS. If the card drops to x4, move it to a PCIe x8 slot (if one is available) or accept the hardware limit and tune the workload around it. Treat the x4 slot as adequate for model loading only; for sustained inference, x8 or x16 is preferable.<br \/>\n&#8211; <strong>Extra Note:<\/strong> If one of the three 3090s is not working, check the PCIe slot for dust and check the cable connections.<\/p>\n<p><strong>Problem 2: GPU Temperature Exceeds 90\u00b0C and Thermal Throttling<\/strong><br \/>\n&#8211; <strong>Symptom:<\/strong> At 100% load, GPU temperatures climb quickly and performance drops.<br \/>\n&#8211; <strong>Root Cause:<\/strong> Airflow is obstructed or the VRM temperature is high. The TRX40 platform can trap the heat behind the GPUs inside the case.<br \/>\n&#8211; <strong>Solution:<\/strong> Set the fan curves manually and increase airflow. If the problem persists, check the PSU and the motherboard VRM temperature. 
If the VRM overheats, a shutdown or slowdown is unavoidable.<br \/>\n&#8211; <strong>Extra Note:<\/strong> Hotspot readings on the 3090 can exceed 100\u00b0C, largely because the GDDR6X VRAM runs hot. To bring VRAM temperature down, set the GPU fan speed close to 100%.<\/p>\n<p><strong>Problem 3: Ollama\/vLLM Does Not Run or Reports &#8220;Out of Memory&#8221;<\/strong><br \/>\n&#8211; <strong>Symptom:<\/strong> A memory error appears while the model is loading, or the process exits.<br \/>\n&#8211; <strong>Root Cause:<\/strong> The total VRAM (24GB x 3 = 72GB) is not enough for the model parameters plus the context length. If layers spill over to system RAM, DDR4 bandwidth becomes the limit.<br \/>\n&#8211; <strong>Solution:<\/strong> Load the model in a lower-precision quantized format (Q4_K_M, Q5_K_M). Reduce the context length (for example <code>n_ctx<\/code> in llama.cpp or <code>--max-model-len<\/code> in vLLM). If you have enough RAM you can also increase system swap space, but keep in mind that this costs performance.<br \/>\n&#8211; <strong>Extra Note:<\/strong> At higher precision, 3x 3090 (72GB VRAM) may not be enough for a 70B model. A 70B model with Q4 quantization needs roughly 40-45GB of VRAM. 
That fits on 3x 3090, but memory management remains critical.<\/p>\n<p><strong>Problem 4: System Instability or Sudden Crashes<\/strong><br \/>\n&#8211; <strong>Symptom:<\/strong> Random crashes or a &#8220;Blue Screen of Death&#8221; (BSOD).<br \/>\n&#8211; <strong>Root Cause:<\/strong> An undersized PSU, RAM incompatibility, or thermal spikes.<br \/>\n&#8211; <strong>Solution:<\/strong> Measure the PSU load. Test the RAM (256 GB) one module at a time. Reset the BIOS to factory defaults, then reapply your settings gradually.<\/p>\n<h3 id=\"karsilastirma-tablosu-model-yukleme-ve-yonetim-stratejileri\">Comparison Table: Model Loading and Management Strategies<\/h3>\n<table>\n<thead>\n<tr>\n<th>Feature<\/th>\n<th>vLLM<\/th>\n<th>llama.cpp (GGUF)<\/th>\n<th>Ollama<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Maximum VRAM Efficiency<\/strong><\/td>\n<td>High (PagedAttention)<\/td>\n<td>Medium (Mixed CPU\/GPU)<\/td>\n<td>Low (Automatic optimization)<\/td>\n<\/tr>\n<tr>\n<td><strong>TTFT (Time To First Token)<\/strong><\/td>\n<td>Lowest<\/td>\n<td>Medium\/High<\/td>\n<td>Medium<\/td>\n<\/tr>\n<tr>\n<td><strong>Model Size Support<\/strong><\/td>\n<td>13B &#8211; 70B+ (Quantized)<\/td>\n<td>7B &#8211; 70B+ (Quantized)<\/td>\n<td>7B &#8211; 70B+ (Quantized)<\/td>\n<\/tr>\n<tr>\n<td><strong>Setup Complexity<\/strong><\/td>\n<td>Medium (Python\/Container)<\/td>\n<td>Low\/Medium (C++\/Binary)<\/td>\n<td>Low (Single Binary)<\/td>\n<\/tr>\n<tr>\n<td><strong>Suitability as a Local Server<\/strong><\/td>\n<td>High (High TPS)<\/td>\n<td>High 
(Flexibility)<\/td>\n<td>Medium (Simplicity)<\/td>\n<\/tr>\n<tr>\n<td><strong>Critical Point<\/strong><\/td>\n<td>Memory Management<\/td>\n<td>Model Quantization Level<\/td>\n<td>User Interface<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"pratik-senaryo-model-degisimi-ve-bellek-yonetimi\">Practical Scenario: Model Switching and Memory Management<\/h3>\n<p>Switching models is the hardest step when all three 3090s are in use. vLLM uses PagedAttention for memory management, but PagedAttention manages the KV cache, not model swaps: when one model has to be evicted and another loaded, the weights must be copied over PCIe again, and that is where the bandwidth limit shows.<\/p>\n<p>For example, when running a 30B model (Q4 quantized), each of the three 3090s uses about 12GB of VRAM, 36GB in total. Loading a 70B model (Q4 quantized) instead requires about 45GB. The weights are copied from RAM to VRAM over the PCIe x8 or x4 links, which takes roughly 10-20 seconds. This is the model switch time. If you switch often, the system spends much of its time transferring weights, and effective inference throughput falls.<\/p>\n<p>Solution: Size each deployment so that it leaves headroom. A 70B model (Q4) needs about 45GB, or 15GB per card when split evenly across 3x 3090. Capping usage at 18GB per card (54GB in total, 75% of the 72GB available) covers the weights and leaves about 9GB for the KV cache. 
Either way, the 70B model occupies all three 3090s.<\/p>\n<p>Because every switch pushes the full set of weights from RAM to VRAM over PCIe, inference stalls for the duration of the transfer. Keep the models you use most resident and switch as rarely as possible.<\/p>\n<h3 id=\"sikca-sorulan-sorular-sss-2\">Frequently Asked Questions (FAQ)<\/h3>\n<p><strong>1. How hot does the TRX40 platform run with 3x 3090?<\/strong><br \/>\nThe TRX40 and 3x 3090 combination builds up serious heat inside the case. GPU temperatures can climb to 85-95\u00b0C. The most critical points, however, are the VRM and RAM temperatures. If the TRX40 board&#8217;s VRMs go past 70\u00b0C under this load, performance may start to drop. Extra case fans or liquid cooling are needed to keep the air inside the case cool.<\/p>\n<p><strong>2. What is the difference between vLLM and llama.cpp on this hardware?<\/strong><br \/>\nvLLM is optimized for high throughput and low TTFT. It is ideal when a model stays resident in memory and has to respond quickly. llama.cpp is more flexible: it can split a model between CPU and GPU or run entirely on the GPUs. On a large system like 3x 3090, vLLM is generally faster, while llama.cpp supports a wider range of model types.<\/p>\n<p><strong>3. How much will the electricity bill increase?<\/strong><br \/>\n3x 3090 plus a Threadripper 3960X draws approximately 1.2-1.5 kW under load. 
Run continuously at that load, this comes to roughly 864-1,080 kWh per month, so the increase in your bill depends mainly on your tariff and on how many hours the system actually spends under load. Cooling costs (fans, ventilation) come on top of that.<\/p>\n<p><strong>4. How much does the PCIe x4\/x8 bottleneck affect performance?<\/strong><br \/>\nPCIe x4 is a critical bottleneck while loading a model. During inference, x8 or x16 is usually sufficient. During a model switch, however, an x4 link can make the transfer take 2-3 times longer. On TRX40, the PCIe lane allocation therefore matters for avoiding this bottleneck.<\/p>\n<p><strong>5. Can I run a 70B-parameter model on this system?<\/strong><br \/>\nYes, 3x 3090 (72GB VRAM) is enough for a 70B model with Q4 quantization. The weights take about 45GB, and the KV cache grows with context length, so long contexts can push VRAM usage close to 100%. Because such a model occupies all three cards, every model switch means another RAM-to-VRAM transfer.<\/p>\n<p><strong>6. How should the BIOS be configured for 3x 3090?<\/strong><br \/>\nCheck the PCIe lane settings in the BIOS manually. For 3x 3090, use x16\/x16\/x4 or x16\/x8\/x8. With the Auto setting, a slot can in some cases drop to x4, so verify the negotiated link width after boot, for example with nvidia-smi -q.<\/p>\n<p><strong>7. How do I optimize vLLM for 3x 3090?<\/strong><br \/>\nKeep the model resident in memory, minimize RAM-to-VRAM transfers by switching models as rarely as possible, and choose a quantization level that leaves VRAM headroom for the KV cache. Note that vLLM&#8217;s tensor parallelism requires the model&#8217;s attention-head count to be divisible by the number of GPUs, so a three-card setup may need pipeline parallelism or a two-card deployment instead.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Threadripper 3960X ASUS Prime: Learn Yerel Yapay Zek\u00e2 Sunucusu with clear steps, practical guidance, and real-world decisions for a faster, cleaner outcome. 
Thr<\/p>\n","protected":false},"author":1,"featured_media":628,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_title":"Threadripper 3960X ASUS Prime: 2026 Guide","rank_math_description":"Threadripper 3960X ASUS Prime: Learn Yerel Yapay Zek\u00e2 Sunucusu with clear steps, practical guidance, and real-world decisions for a faster, cleaner outcome. Thr","rank_math_focus_keyword":"Threadripper 3960X ASUS Prime","footnotes":""},"categories":[1],"tags":[270,269,271,256,268,267],"class_list":["post-630","post","type-post","status-publish","format-standard","has-post-thumbnail","category-genel","tag-rtx-3090","tag-threadripper-3960x","tag-trx40","tag-vllm","tag-yerel-yapay-zeka","tag-yerel-yapay-zeka-sunucusu"],"_links":{"self":[{"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/posts\/630","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/comments?post=630"}],"version-history":[{"count":0,"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/posts\/630\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/media\/628"}],"wp:attachment":[{"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/media?parent=630"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/categories?post=630"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/m4.ist\/index.php\/wp-json\/wp\/v2\/tags?post=630"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}