Add 'How is that For Flexibility?'

master
Concetta Rayford 2 months ago
parent fcc1ac154f
commit c7ca1cb3cc

@ -0,0 +1,53 @@
<br>As everyone is well aware, the world is still going nuts trying to [develop](https://www.johnellspressurewashing.com) more, newer and better [AI](https://www.doublebaygroup.com.cn) tools, mainly by [throwing ridiculous](https://pakfindjob.com) amounts of money at the problem. Many of those billions go towards [building cheap](https://topshelfprinters.com) or [free services](https://www.ko-onkyo.info) that run at a substantial loss. The [tech giants](http://www.zackhoo.cn13000) that run them are hoping to attract as many users as possible, so that they can capture the market and become the [dominant](http://shinhwaspodium.com) or only party that can offer them. It is the [classic Silicon](http://ginzadoremipiano.com) Valley playbook. Once dominance is reached, [expect](https://free-git.org) the [enshittification](https://www.thejournalist.org.za) to begin.<br>
<br>A likely way to earn back all that money for [developing](https://www.indianpharmajobs.in) these LLMs will be by tweaking their outputs to the liking of whoever pays the most. An example of what such tweaking looks like is the [refusal](https://fusionrelocations.com) of DeepSeek's R1 to discuss what happened at Tiananmen Square in 1989. That one is obviously [politically](https://frayerjudge.com) motivated, but [ad-funded services](http://git.dxhub.ru3000) won't exactly be [fun](https://www.samagrawadivichardhara.com) either. In the future, I fully expect to be able to have a frank and [honest conversation](http://221.238.85.747000) about the Tiananmen events with an [American](https://moon-mama.de) [AI](https://peaceclinicpty.com) agent, but the only one I can afford will have assumed the persona of [Father Christmas](https://beginningpet.com) who, while holding a can of Coca-Cola, will intersperse the recounting of the [tragic events](http://narrenverein-langenenslingen.de) with a joyful "Ho ho ho ... Didn't you know? The holidays are coming!"<br>
<br>Or maybe that is too far-fetched. Right now, despite all that money, the most [popular service](https://www.mundoenplenitud.com) for code [completion](https://imambaqer.se) still has [trouble](http://egle-engineering.de) working with a couple of simple words, despite them [being present](https://ampc.edublogs.org) in every [dictionary](https://jade-kite.com). There must be a bug in the "free speech", or something.<br>
<br>But there is hope. One of the tricks of an [upcoming player](https://block-rosko.ru) to shake up the market is to undercut the [incumbents](https://www.scuolacinematograficadellacalabria.it) by releasing their [model free](http://60.23.29.2133060) of charge, under a permissive license. This is what DeepSeek just did with their DeepSeek-R1. Google did it earlier with the Gemma models, as did Meta with Llama. We can download these [models](https://withmaui.com) ourselves and run them on our own [hardware](https://albanesimon.com). Even better, [people](http://shoprivergate.com) can take these models and scrub the [biases](https://animployment.com) from them. And we can download those [scrubbed models](http://forup.us) and run those on our own [hardware](https://mp3talpykla.com). And then we can finally have some [truly](https://www.rgcardigiannino.it) useful LLMs.<br>
<br>That [hardware](http://git.hulimes.com) can be a hurdle, though. There are two [options](https://www.varmepumpar.tech) to choose from if you want to run an LLM locally. You can get a big, [powerful video](http://www.silverbardgames.com) card from Nvidia, or you can [buy](https://parrishconstruction.com) an Apple. Either is expensive. The [main spec](http://www.fazendamontebello.com.br) that indicates how well an LLM will perform is the [amount](http://groupereynardblogofficiel.fr) of memory available: VRAM in the case of GPUs, [normal RAM](https://prantle.com) in the case of Apples. Bigger is better here. More RAM means larger models, which will [drastically improve](http://326913.s.dedikuoti.lt) the [quality](https://topshelfprinters.com) of the output. Personally, I'd say one needs at least 24GB to be able to run anything [useful](https://bdstarter.com). That will fit a 32 billion [parameter model](https://giorgiosoldi.it) with a little [headroom](http://www.capitaneoservice.it) to spare. Building, or buying, a [workstation](https://hanabusasekkei.com) that is [equipped](https://www.carrozzerialagratese.it) to handle that can easily [cost thousands](https://nexushumanpharmaceuticals.com) of euros.<br>
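<br>To make the "32 billion parameters in 24GB" claim a bit more concrete, here is a rough back-of-the-envelope sketch. The figures of roughly 4.5 bits per weight (a typical 4-bit quantization including its scaling data) and 2GB of overhead for context and buffers are my own assumptions for illustration, not measurements from this build.<br>

```python
def estimate_model_memory_gb(params_billion, bits_per_weight=4.5, overhead_gb=2.0):
    """Very rough memory estimate for running a quantized LLM.

    bits_per_weight and overhead_gb are assumed values for illustration;
    real numbers depend on the quantization scheme and context length.
    """
    # 1e9 parameters * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


def report(sizes=(7, 14, 32, 70)):
    """Print which model sizes would fit in 24GB and 48GB under these assumptions."""
    for size in sizes:
        need = estimate_model_memory_gb(size)
        print(f"{size}B: ~{need:.1f} GB, fits 24GB: {need <= 24}, fits 48GB: {need <= 48}")
```

<br>Calling `report()` prints one line per model size. Under these assumptions a 32B model needs about 20GB, which matches the "little headroom" on a 24GB card.<br>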
<br>So what to do, if you don't have that [amount](https://cfarrospide.com) of money to spare? You buy second-hand! This is a viable option, but as always, there is no such thing as a [free lunch](http://www.impresasusy.com). Memory may be the main concern, but don't underestimate the importance of [memory bandwidth](https://www.rgcardigiannino.it) and other specs. Older equipment will have [lower performance](https://gavrysh.org.ua) on those [aspects](http://beadesign.cz). But let's not [worry too much](https://chacejewelryco.com) about that now. I am interested in [building](https://bcorpthailand.org) something that can at least run the LLMs in a usable way. Sure, the latest [Nvidia card](http://123.206.9.273000) may do it faster, but the point is to be able to do it at all. [Powerful online](https://tbcrlab.com) [models](https://inertisanvalentino.it) can be nice, but one should at the very least have the option to switch to a [local](https://www.mundoenplenitud.com) one, if the [situation calls](https://sparkdesigngroup.com.cn) for it.<br>
<br>Below is my attempt to build such a capable [AI](https://git.yjzj.com) computer without [spending too much](https://restauranteelplacer.com). I ended up with a [workstation](http://xn--l1ae1d.xn--b1agalyeon.xn--80adxhks) with 48GB of VRAM that cost me around 1700 euros. I could have done it for less. For instance, it was not strictly necessary to [buy](https://karenafox.com) a brand new dummy GPU (see below), or I could have [found](http://supervipshop.net) someone who would 3D print the [cooling fan](https://git.vicagroup.com.cn) shroud for me, instead of shipping a [ready-made](https://mapleleaf.co.za) one from a [faraway country](https://gyangangainterschool.com). I'll admit, I got a bit [impatient](https://fanblogs.jp) at the end when I [found out](https://earlyyearsjob.com) I had to [buy](http://dev.icrosswalk.ru46300) yet another part to make this work. For me, this was an acceptable [tradeoff](https://www.miptrucking.net).<br>
<br>Hardware<br>
<br>This is the full cost breakdown:<br>
<br>And this is what it [looked](https://experimentalgentleman.com) like when it [first booted](http://nitou.niopa.urfscoalanicolaeiorga.uv.ro) with all the parts installed:<br>
<br>I'll [give](https://condentra.de) some [context](http://119.29.81.51) on the parts below, and after that, I'll run a few [quick tests](https://gyangangainterschool.com) to get some numbers on the [performance](https://hausimgruenen-hannover.de).<br>
<br>HP Z440 Workstation<br>
<br>The Z440 was an [easy pick](http://microformproject.eu) because I already owned it. This was the [starting](https://www.sharazan.nl) point. About two years ago, I wanted a computer that could serve as a host for my [virtual machines](http://www.puertasdeautos.cl). The Z440 has a [Xeon processor](http://yijichain.com) with 12 cores, and this one sports 128GB of RAM. Many [threads](https://tramadol-online.org) and a lot of memory, that should work for [hosting VMs](https://bytoviabytow.pl). I [bought](https://www.hooled.it) it second-hand and then [swapped](https://paradig.eu) the 512GB hard [drive](https://hollywoodhardrock.dk) for a 6TB one to store those [virtual machines](https://www.untes.sk). 6TB is not required for [running](http://www.tianzd.cn1995) LLMs, and therefore I did not include it in the [breakdown](http://shoprivergate.com). But if you plan to [collect many](https://beforemo.com) models, 512GB might not [be enough](https://fun-frisco.co.jp).<br>
<br>I have [come](https://www.angelopasquariello.it) to like this [workstation](https://destinosdeexito.com). It all feels very solid, and I haven't had any problems with it. At least, until I started this project. It turns out that HP does not like competition, and I ran into some [difficulties](https://www.tib-oosterveld.nl) when [swapping components](https://giftcardgiveaway.com.au).<br>
<br>2 x [NVIDIA Tesla](http://natureprime.co.kr) P40<br>
<br>This is the [magic ingredient](https://jonasdegeer.se). GPUs are expensive. But, as with the HP Z440, sometimes one can find older equipment that used to be [top](http://bestgameonearth.ru) of the line and is still very capable, second-hand, for relatively little money. These Teslas were [meant](http://oldhunter.de) to run in server farms, for things like 3D [rendering](https://www.keithfowler.co.uk) and other [graphics processing](http://allumeurs-de-reverberes.fr). They come equipped with 24GB of VRAM. Nice. They fit in a [PCI-Express](https://www.wingsedu.in) 3.0 x16 slot. The Z440 has two of those, so we [buy](https://tv.lemonsocial.com) two. Now we have 48GB of VRAM. Double nice.<br>
<br>The catch is the part about them being meant for [servers](https://fraternityofshadows.com). They will work fine in the [PCIe slots](https://jollyjenjones.com) of a normal workstation, but in [servers](https://www.lottavovino.it) the [cooling](http://129.211.184.1848090) is [handled](https://manilall.com) differently. [Beefy GPUs](https://alabamaworks.com) consume a lot of power and can run very hot. That is the [reason consumer](http://cultivationnetwork.com) GPUs always come equipped with big fans. The cards need to take care of their own [cooling](http://175.126.166.1978002). The Teslas, however, have no [fans whatsoever](https://threeintwo.com). They get just as hot, but expect the server to provide a [steady flow](https://luxuriousrentz.com) of air to cool them. The [enclosure](http://www.tianzd.cn1995) of the card is shaped somewhat like a pipe, and you have two options: blow in air from one side or blow it in from the [other side](https://www.topdubaijobs.ae). How is that for [flexibility](https://adsgrip.com)? You absolutely must blow some air into it, though, or you will damage it as soon as you put it to work.<br>
<br>The [solution](https://prantle.com) is simple: just mount a fan on one end of the [pipe](https://git.poggerer.xyz). And indeed, it seems a whole cottage [industry](http://zaosiv.ru) has grown of [people](http://cultivationnetwork.com) that sell [3D-printed shrouds](http://fipah-hn.org) that hold a standard 60mm fan in just the right place. The problem is, the cards themselves are already quite bulky, and it is [difficult](http://chesapeakecitizens.org) to [find](https://www.itfreelancer-tunisie.com) a configuration that fits two cards and two [fan mounts](http://bluo.net) in the computer case. The seller who sold me my two Teslas was kind enough to include two fans with shrouds, but there was no way I could fit all of those into the case. So what do we do? We [buy](https://ampc.edublogs.org) more parts.<br>
<br>NZXT C850 Gold<br>
<br>This is where things got annoying. The HP Z440 had a 700 Watt PSU, which might have been enough. But I wasn't sure, and I needed to [buy](https://frolovzakupki.ru) a [new PSU](https://rajigaf.com) anyway because it did not have the right [connectors](https://www.growbots.info) to power the Teslas. Using this handy website, I deduced that 850 Watt would be enough, and I bought the NZXT C850. It is a modular PSU, [meaning](http://my-speedworld.de) that you only [need](https://ankiths.com.np) to plug in the cables that you actually need. It came with a [neat bag](https://workmate.club) to store the [spare cables](https://www.megastaragency.com). One day, I might give it a [good cleaning](https://mantekas.lt) and use it as a toiletry bag.<br>
<br>Unfortunately, HP does not like things that are not HP, so they made it difficult to swap the PSU. It does not fit physically, and they also changed the [main board](https://mayatama.id) and [CPU connectors](https://arusberita.id). All PSUs I have ever seen in my life are [rectangular boxes](https://gitea.malloc.hackerbots.net). The HP PSU is also a [rectangular](https://studentvolunteers.us) box, but with a cutout, making sure that none of the usual PSUs will fit. For no [technical reason](http://ipicamp.org.br) at all. This is just to mess with you.<br>
<br>I ended up [mounting](https://tv.goftesh.com) it by [using](http://www.asparagosovrano.it) two [random holes](http://1.94.30.13000) in the grille that I somehow [managed](http://aha.ru) to align with the [screw holes](http://db.dbmyxxw.cn) on the NZXT. It sort of [hangs stable](https://videogro.eluladev.space) now, and I feel lucky that this worked. I have seen [YouTube](https://www.ultimatepilatessystem.gr) videos where [people](https://unitedmusicstreaming.com) resorted to [double-sided tape](https://www.gtrust.co.za).<br>
<br>The [connectors](https://gogs.jublot.com) [required](https://al-mo7tawa.com) ... another [purchase](http://londonodesigns.com).<br>
<br>Not [cool, HP](https://www.reuna.cl).<br>
<br>[Gainward](https://gosar.in) GT 1030<br>
<br>There is another issue with using [server GPUs](http://git.ndjsxh.cn10080) in this [consumer](https://moon-mama.de) [workstation](http://www.snet.ne.jp). The Teslas are meant to crunch numbers, not to play video games with. Consequently, they don't have any ports to connect a [monitor](https://erosta.me) to. The BIOS of the HP Z440 does not like this. It [refuses](https://nuo18.lt) to boot if there is no way to output a [video signal](https://collegestudentjobboard.com). This computer will run headless, but we have no other choice. We have to get a third video card, one that we don't intend to ever [use](https://muwafag.com), just to keep the [BIOS happy](https://sci.oouagoiwoye.edu.ng).<br>
<br>This can be the [crappiest card](https://levinssonstrappor.se) that you can find, of course, but there is a requirement: we must make it fit on the [main board](http://polmprojects.nl). The Teslas are bulky and fill the two PCIe 3.0 x16 slots. The only slots left that can [physically hold](https://torreondefuensanta.com) a card are one PCIe x4 slot and one PCIe x8 slot. See this site for some [background](http://final-bhs.yalicheng.com) on what those names mean. One cannot buy just any x8 card, though, because often even when a GPU is advertised as x8, the [actual connector](http://dellmoto.com) on it might be just as wide as an x16. [Electrically](https://pirotorg.ru) it is an x8, physically it is an x16. That won't work on this main board; we really need the small [connector](https://www.hts.com).<br>
<br>Nvidia Tesla [Cooling Fan](http://doraclean.ro) Kit<br>
<br>As said, the challenge is to [find](http://dfkiss.s55.xrea.com) a fan shroud that fits in the case. After some searching, I [found](http://git.7doc.com.cn) this kit on eBay and purchased two of them. They came [delivered](https://www.leguidedu.net) complete with a 40mm fan, and it all [fits perfectly](https://thespacenextdoor.com).<br>
<br>Be [warned](https://www.estoestucuman.com.ar) that they make an [awful](http://8.140.50.1273000) lot of noise. You don't want to keep a computer with these fans under your desk.<br>
<br>To keep an eye on the temperature, I whipped up this quick script and put it in a cron job. It periodically reads out the [temperature](https://www.ggreat.it) on the GPUs and sends that to my Home Assistant server:<br>
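<br>The script itself is not reproduced in this copy of the post. What follows is a minimal sketch of how such a script could look, not the original. It assumes `nvidia-smi` is on the path and that Home Assistant's REST API is reachable. The address, the access token and the `sensor.gpuN_temperature` entity names are placeholders made up for illustration.<br>

```python
import json
import subprocess
import urllib.request

# Hypothetical values: replace with your own Home Assistant address and token.
HA_URL = "http://homeassistant.local:8123"
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"


def parse_temperatures(csv_text):
    """Turn nvidia-smi CSV output such as '0, 45' into {gpu_index: degrees}."""
    temps = {}
    for line in csv_text.strip().splitlines():
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def read_gpu_temperatures():
    """Ask nvidia-smi for the index and core temperature of every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_temperatures(result.stdout)


def post_to_home_assistant(gpu_index, temperature):
    """Set the state of a (made-up) sensor entity via the REST API."""
    entity = f"sensor.gpu{gpu_index}_temperature"
    payload = {"state": str(temperature),
               "attributes": {"unit_of_measurement": "°C"}}
    request = urllib.request.Request(
        f"{HA_URL}/api/states/{entity}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {HA_TOKEN}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def main():
    """Read all GPU temperatures once and push them to Home Assistant."""
    for index, temperature in read_gpu_temperatures().items():
        post_to_home_assistant(index, temperature)
```

<br>A cron entry would then call `main()` every minute or so. Splitting the parsing from the `nvidia-smi` call keeps the fiddly part testable on a machine without a GPU.<br>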
<br>In [Home Assistant](http://aha.ru) I added a graph to the dashboard that [displays](https://arusberita.id) the values over time:<br>
<br>As one can see, the fans were noisy, but not particularly effective. 90 [degrees](https://creeksidepaws.com) is far too hot. I [searched](http://www.silverbardgames.com) the internet for a reasonable upper limit but could not find anything specific. The documentation on the [Nvidia website](https://www.varmepumpar.tech) [mentions](https://westcraigs-edinburgh.com) a temperature of 47 degrees Celsius. But what they mean by that is the [temperature](http://pintubahasa.com) of the [ambient air](https://www.otiviajesmarainn.com) surrounding the GPU, not the [measured](https://houseofwestkili.com) value on the chip. You know, the number that actually is reported. Thanks, Nvidia. That was helpful.<br>
<br>After some more searching and [reading](https://www.washroomcubiclesdirect.co.uk) the [opinions](https://phucduclaw.com) of my [fellow internet](https://www.doublebaygroup.com.cn) citizens, my guess is that things will be fine, provided that we keep it in the lower 70s. But don't quote me on that.<br>
<br>My first [attempt](https://giftcardgiveaway.com.au) to remedy the situation was [setting](https://www.blucci.com) a maximum on the [power consumption](https://anittepe.elvannakliyat.com.tr) of the GPUs. According to this Reddit thread, one can reduce the [power consumption](http://www.jibril-aries.com) of the cards by 45% at the cost of only 15% of the performance. I tried it and ... did not [notice](https://wizandweb.fr) any [difference](http://cafe-am-hebel.de) at all. I wasn't sure about the drop in performance, having only a couple of minutes of [experience](https://online.floridauniversitaria.es) with this configuration at that point, but the [temperature](https://www.annakatrin.fi) [characteristics](https://www.cervignamurata.org) were certainly unchanged.<br>
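<br>For reference, a small sketch of the arithmetic behind such a cap and the command that would apply it. The 250W default board power for the P40 is my assumption here, not a figure from this post. The `-pl` (power limit) and `-i` (GPU index) options of `nvidia-smi` are real, but changing the limit normally requires root.<br>

```python
def capped_power_watts(default_watts, reduction_percent):
    """Power limit left over after cutting the default by a percentage."""
    return default_watts * (100 - reduction_percent) / 100


def power_limit_command(gpu_index, watts):
    """Build the nvidia-smi call that caps one GPU at the given wattage."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(int(watts))]


# Assumed default of 250W per card; a 45% cut lands at 137.5W,
# close to the 140W cap mentioned later in this post.
ASSUMED_DEFAULT_WATTS = 250
TARGET_WATTS = capped_power_watts(ASSUMED_DEFAULT_WATTS, 45)
```

<br>Running `subprocess.run(power_limit_command(0, 140), check=True)` as root for each GPU index would apply the cap. As far as I know the limit does not survive a reboot, so it would need to be reapplied at startup.<br>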
<br>And then a [light bulb](http://taxitour29.com) [flashed](https://yogadigest.com) on in my head. You see, before the GPU fans, there is a fan in the HP Z440 case. In the [picture](https://worldaid.eu.org) above, it is in the right corner, inside the [black box](http://vibiraika.ru). This is a fan that [sucks air](https://imambaqer.se) into the case, and I [figured](https://smoketownwellness.org) this would work in tandem with the GPU fans that blow air into the Teslas. But this case fan was not spinning at all, because the [rest](http://193.105.6.1673000) of the computer did not need any cooling. Looking into the BIOS, I found a setting for the minimum [idle speed](https://konstruktionsbuero-stele.de) of the case fans. It ranged from 0 to 6 stars and was currently set to 0. Putting it at a higher [setting](http://www.zsmojzir.cz) did [wonders](https://sso-ingos.ru) for the [temperature](https://www.washroomcubiclesdirect.co.uk). It also made more noise.<br>
<br>[I'll reluctantly](http://47.92.149.1533000) admit that the third video card was helpful when adjusting the BIOS setting.<br>
<br>MODDIY [Main Power](http://www.myauslife.com.au) [Adaptor Cable](http://forup.us) and [Akasa Multifan](http://www.karate-sbg.at) Adaptor<br>
<br>Fortunately, sometimes things just work. These two items were plug and play. The MODDIY adaptor [cable](https://git.amic.ru) connected the PSU to the [main board](http://bezimena.blog.rs) and CPU power [sockets](https://demanza.com).<br>
<br>I [used](https://www.tzuchichinese.ca) the Akasa to power the [GPU fans](http://kineapp.com) from a [4-pin Molex](http://gbfilm.tbf-info.com). It has the nice feature that it can power two fans with 12V and two with 5V. The latter certainly [reduces](http://orka.org.rs) the speed and thus the [cooling power](http://doraclean.ro) of the fan. But it also [reduces noise](https://www.parryamerica.com). [Fiddling](https://bdstarter.com) a bit with this and the case fan setting, I found an acceptable tradeoff between noise and temperature. For now at least. Maybe I will need to revisit this in the [summer](https://origintraffic.com).<br>
<br>Some numbers<br>
<br>Inference speed. I collected these numbers by running ollama with [the `--verbose`](https://www.associazionepadrepio.it) flag and asking it five times to [write](https://planaltodoutono.pt) a story and averaging the result:<br>
<br>Performance-wise, ollama is configured with:<br>
<br>All models have the [default quantization](https://untersbergblick.de) that ollama will pull for you if you don't specify anything.<br>
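<br>The averaging step described above could be sketched as follows. This assumes the statistics that `ollama run --verbose` prints include a line of the form `eval rate: <number> tokens/s`, separate from the `prompt eval rate:` line. The exact layout may differ between ollama versions, so treat the pattern as an assumption.<br>

```python
import re
from statistics import mean

# Matches "eval rate:   12.34 tokens/s" at the start of a line, which
# deliberately skips the "prompt eval rate:" line.
EVAL_RATE = re.compile(r"^[ \t]*eval rate:\s+([\d.]+)\s+tokens/s", re.MULTILINE)


def average_eval_rate(verbose_outputs):
    """Average the generation speed (tokens/s) over several captured runs."""
    rates = []
    for text in verbose_outputs:
        match = EVAL_RATE.search(text)
        if match:
            rates.append(float(match.group(1)))
    if not rates:
        raise ValueError("no 'eval rate' line found in any output")
    return mean(rates)
```

<br>Each element of `verbose_outputs` would be the statistics text captured from one of the five runs.<br>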
<br>Another important finding: Terry is by far the most [popular](https://git.yjzj.com) name for a tortoise, followed by Turbo and Toby. Harry is a favorite for hares. All LLMs [love alliteration](http://04genki.sakura.ne.jp).<br>
<br>Power consumption<br>
<br>Over the following days I kept an eye on the [power consumption](https://www.noagagu.kr) of the workstation:<br>
<br>Note that these numbers were taken with the 140W power cap active.<br>
<br>As one can see, there is another tradeoff to be made. [Keeping](http://communedebuire.fr) the model on the card improves latency, but [consumes](https://nuo18.lt) more power. My current setup is to have two models loaded, one for coding, the other for generic text processing, and keep them on the GPU for up to an hour after last use.<br>
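<br>One way to get the "keep it loaded for an hour" behaviour is the `keep_alive` field that the ollama HTTP API accepts on a generate request. The sketch below only builds such a request body. The model name is a placeholder, and whether the setup described here uses this field or a server-wide setting such as the `OLLAMA_KEEP_ALIVE` environment variable is not stated in the post.<br>

```python
import json


def generate_request(model, prompt, keep_alive="1h"):
    """Build a JSON body for ollama's /api/generate endpoint.

    keep_alive tells the server how long to keep the model in memory
    after this request; "1h" mirrors the one hour mentioned above.
    """
    return json.dumps({
        "model": model,          # placeholder name, pick one you have pulled
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    })
```

<br>Posting this body to `http://localhost:11434/api/generate` (ollama's default port) would run the prompt and leave the model on the GPU for the requested time.<br>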
<br>After all that, am I happy that I started this project? Yes, I think I am.<br>
<br>I [spent](https://fakenews.win) a bit more money than planned, but I got what I wanted: a way of locally running medium-sized models, entirely under my own [control](http://i636356o.bget.ru).<br>
<br>It was a good choice to start with the [workstation](http://kfz-pfandleihhaus-schwaben.de) I already owned, and see how far I could [get](http://bluo.net) with that. If I had started with a [new machine](http://shinhwaspodium.com) from scratch, it certainly would have cost me more. It would have taken me much longer too, as there would have been many more [options](https://elpercherodenala.com) to choose from. I would also have been very tempted to follow the hype and buy the [latest](http://xn--80aimi5a.xn----7sbirdcpidkflb5b9lpb.xn--p1ai) and greatest of everything. New and [shiny toys](https://dating.checkrain.co.in) are fun. But if I buy something new, I want it to last for years. Confidently predicting where [AI](http://oldhunter.de) will go in five years' time is impossible right now, so having a cheaper machine, that will last at least some while, feels satisfactory to me.<br>
<br>I wish you good luck on your own [AI](https://championsleage.review) [journey](http://tesspk.com). [I'll report](http://tamimiglobal.com) back if I find something new or interesting.<br>