The Return of 2-2-2: Corsair 3200XL & Samsung PC4000
by Wesley Fink on June 15, 2004 9:00 PM EST - Posted in Memory
AMD Athlon 64 Performance Test Configuration
The recently introduced nForce3-250 and VIA K8T800 PRO chipsets finally promise a working PCI/AGP lock for Athlon 64. Earlier chipsets only had a working AGP/PCI lock on a handful of boards such as the AOpen AK89 Max. Thus far, we have tested the Epox, MSI, Chaintech, and Gigabyte nF3-250 chipset motherboards and found a working AGP lock on all these Socket 754 boards. We also found a working PCI lock on the 2nd revision of the Abit KV8 PRO based on the VIA K8T800 PRO chipset, but we had some issues with multipliers on that board.
The same chipsets are used with the just-released AMD Socket 939 Dual-Channel processor. We are in the early stages of testing dual-channel Socket 939 motherboards, but we have been working with Asus on their A8V Deluxe based on the VIA K8T800 PRO chipset. Early revisions had no PCI/AGP lock and limited overclocking, but Asus made a hardware revision to shipping boards, which added a working AGP/PCI lock. The board has also improved through a number of beta BIOS releases, to the point where the most recent beta BIOS has fixed many of the overclocking issues on the Asus A8V. We have been able to achieve 1:1 overclocks as high as 265 FSB on the A8V with the latest BIOS.
While it is far too early to establish a standard memory test bed with a Socket 939 board, we have been experimenting with the working AGP/PCI lock to allow effective testing of Athlon 64 memory performance. With Intel moving to DDR2 in the upcoming 915/925X, we will likely move DDR testing to an Athlon 64 Dual-Channel test bed in the near future.
Athlon 64 Performance Test Configuration | |
Processor(s): | AMD Athlon 64 3800+ Socket 939 |
RAM: | 2 x 512MB Corsair 3200XL (DS); 2 x 256MB Samsung PC4000 (SS) |
Hard Drives: | Seagate 120GB IDE 7200 RPM (8MB Buffer) |
PCI/AGP Speed: | Fixed at 33/66 |
Chipset Drivers: | VIA Hyperion 4.51 |
Video Card(s): | ATI 9800 PRO 128MB, 128MB aperture, 1024x768x32 |
Video Drivers: | ATI Catalyst 4.6 |
Power Supply: | Antec True Power 430W |
Operating System(s): | Windows XP Professional SP1 |
Motherboards: | Asus A8V Deluxe (VIA K8T800 PRO), Beta BIOS 1005.020 |
In our testing with Socket 939 boards, we found that the best performance is achieved at a tRAS setting of 10. All performance tests were run with tRAS set to 10.
Test Settings
The Athlon 64 also has the unique feature of unlocked multipliers below the rated speed on all processors, and both above and below rated speed on the FX chips. This feature is not currently available on Pentium 4 processors. The unlocked lower ratios combined with a working AGP lock make it possible to take a different approach to testing memory performance on the Athlon 64: the processor speed can be held fixed while the memory speed is varied, to see the real impact of higher memory speed alone on typical performance.
The standard Quake 3 (OpenGL), Super PI (raw calculation performance), SiSoft Sandra 2004 Standard Buffered memory test (synthetic memory test), and SiSoft Sandra 2004 Standard UNBuffered memory test were run as usual. However, to test the effect of memory speed on performance more effectively, we expanded the benchmarks used for testing. UT2003 (DirectX 8) and Aquamark 3 (DirectX 9) were added to the memory tests to provide a broader range of performance measurements.
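The fixed-processor-speed approach described above comes down to simple arithmetic: CPU speed is the base clock times the multiplier, so raising the base clock while lowering the multiplier keeps the CPU constant and changes only the memory speed. The sketch below is illustrative only. The function names are hypothetical, and the two base clocks (200 and 240) are assumptions chosen to correspond to DDR400 and DDR480 at 1:1 with the 3800+ at its rated 2.4GHz; they are not a record of the exact BIOS settings used.

```python
# Illustrative sketch only: function names and the chosen base clocks are
# assumptions for this example, not the review's actual BIOS settings.

def cpu_clock_mhz(base_clock_mhz: float, multiplier: float) -> float:
    """CPU speed is the base ("FSB") clock times the CPU multiplier."""
    return base_clock_mhz * multiplier


def ddr_rating(base_clock_mhz: float) -> int:
    """With memory at 1:1, DDR transfers twice per clock (200 MHz -> DDR400)."""
    return round(base_clock_mhz * 2)


def multiplier_for(target_cpu_mhz: float, base_clock_mhz: float) -> float:
    """Multiplier needed to hold the CPU at target_cpu_mhz for a base clock."""
    return target_cpu_mhz / base_clock_mhz


TARGET_CPU_MHZ = 2400  # Athlon 64 3800+ rated speed (2.4GHz)

for base in (200, 240):
    mult = multiplier_for(TARGET_CPU_MHZ, base)
    cpu = cpu_clock_mhz(base, mult)
    print(f"{base} MHz x {mult:g} = {cpu:g} MHz CPU, DDR{ddr_rating(base)}")
```

Run as written, this pairs a 12x multiplier with the 200 base clock and a 10x multiplier with the 240 base clock, both giving a 2400 MHz CPU. A base clock that does not divide the target evenly would give a fractional multiplier, which a real board may not offer, so the CPU speed could then only be matched approximately.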
11 Comments
Ozone1 - Saturday, June 26, 2004
Maybe I missed it, but in all your memory tests do you guys lower the multiplier so that the system remains at a constant speed and then just increase the FSB? If you don't, your reviews would be far more beneficial if you did. Why? Well then it would be easier to see the performance differences when timing changes between the ram. Also, why don't you list the timing for each ram? It just feels like so much is left out of your ram reviews that they don't help me as much as they could... Thoughts?
Pumpkinierre - Saturday, June 19, 2004
Yeah Zvorak your imagination is spot on. But you have to have equal track lengths (for correct timing) on the Mobo to the DIMM or VLSI unless you use serial transmission ie RAMBUS (which I like). So the closer your chip is to the cpu the harder this will be. Also, remember HT bus is 1000 MB/s which is slow cf. to present bandwidths of 3000MB/s unbuffered. Getting rid of system RAM altogether is the best idea and having a smaller size memory replacing the L2 cache (128Mb would be plenty) running at half cpu speed pumping a large L1 cache 256K would be the way to go. This could be on die or as you suggest a replaceable cpu style chip which would allow upgrades.
Zvorak - Friday, June 18, 2004
Pump, thnx for the open arms... lol
Aye, I agree with what yur saying about the A64 and the latencies... that was my over all point. Trying to speed up the throughput by increasing timing on the FSB won't do much IMO because the CPU is significantly faster... whereas ridding your system of latencies so the memory can operate efficiently and on par with the CPU has shown to improve system performance.
I truly believe with "Dual-Channel" memory FSB will become a thing of the past when dealing with tweaking your system... the faster the stick of Memory works internally is most important... but I cannot understand why a new type of memory isn't in the works such as a VLSI 1 gig "chip" that gets dropped onto your MB with 2 ns Cas 1-1-1-x that's just off the side of the CPU with a dedicated HT bus ...?
yes.. no ?
Lets get real... most Systems today are built with 512 or 1g of memory, why not remove the stupid DIMM slot and go with a second processor style slot that you drop in a Ram Chip ...? or am I too forward in my thinking..?
Pumpkinierre - Thursday, June 17, 2004
Sorry Trog #3, I didnt get your remark on the P4. You're advocating using two cpus: a 3.4c and an o'clocked 2.8c at 3.4Gig and seeing the performance improvement of the latter. I was too zonked with a64 thinking. I have seen tests of o'clocked 2.4c@3.2 vs 3.2c when the 800MHz N'woods first came out but I dont know where (might have been AT). The memory could have been at 5:4 or slower RAM timings but from memory the 2.4c still came out on top (I dont know by how much- from visual memory of bar charts it was < 5% dont quote me).
Yes Wesley #5, it would be interesting. In fact you have already done it once with the OCZ3700EB tests:
http://www.anandtech.com/memory/showdoc.html?i=205...
Here, there was only a paltry 3% increase for a 30MHz bus speed increase (~13%) and again not much performance correlation with unbuf. Sandra bandwidth. I think you gotta run your tests at lower video card settings (maybe 16bit) or use an X800 to truly see what is going on with memory latency.
Welcome to AT Zvorak, I only started posting (initially anonymously) after my patience with AMD ended with the release of a single expensive heavy cached a64 last September (and I had to yell from the rooftops) despite reading AT for several years previously. Yes some people say what you say about memory timings and o'clocking bus speed but, to me, if the on die memory controller (which runs at cpu speed) is so fast then the bottleneck must be the system RAM. So faster memory latencies and RAM speed should benefit the a64 more so than other cpus eg P4 or A-XP where the Northbridge/ memory controller and RAM run at bus (FSB) speed (with the cpu at a much higher speed). So far only lowering memory latencies seems to do it and raising bus speed doesnt do much, but I have yet to see a comprehensive test on this. Its important because AMD multiplier OVERlock their a64 multipliers (not FXs) and the logic to me of this, is to encourage enthusiasts to increase their bus speeds (without o'clocking too much). Remember when you increase bus speed on a multiplier locked cpu like the P4 or later A-XP you increase both the cpu speed and the memory/NB speed but the a64 allows you to just increase RAM speed while holding cpu speed steady. This is of particular interest to individuals (like me) who are only interested in mild overclocks of their system. Still, cache plays a part in this and probably acts as a buffer especially in predictable apps. and tests. So unless it is something to do with memory controller tuning, it may be that the bus speed effect will be more relevant to the lower cached Paris/Semprons coming out soon or to actual gaming (give me an a64 system and I will let you know!).
The same arguments are given for the poor performance of encoding tests by the a64s. But the bandwidth tests (which are more relevant to this than demo tests for gaming) show the dual channel a64s to be as good as the P4s (with the quad pumped Netburst architecture) in the unbuffered tests and phenomenal in the buffered tests. People say its other bottlenecks- cache/pipeline thing but that doesnt cut the mustard with me. My belief is that the encoding software is tuned to the P4 architecture so if this were to change then the a64 would be streets ahead. Its quite an interesting and phenomenal cpu yet to realise its full potential. Its a pity it doesn't have Intel behind it.
Zvorak - Thursday, June 17, 2004
Pump, I was under the impression that the A64 wasnt as dependent upon memory timing as other CPU's .. and increasing the FSB will do very little to memory timing because the controller is on the CPU die... increasing the processor mhz is like OC the memory timing because it scales up with faster MHZ ...no ?
BTW... my first post here at AT, but been here from the start !!
Wesley Fink - Thursday, June 17, 2004
#4 - We ran DDR400 and DDR480 at the fastest timings that were supported at each memory speed, primarily to answer the question of whether fast DDR400 is faster than slower-timed but higher speed memory. You bring up an interesting idea of running DDR400 and DDR480 at the same slower timings (2.5-3-3-10 in this case) and the same CPU speed of 2.4GHz (3800+). This would be an interesting set of benchmarks for a future memory review.
Since the Athlon 64 architecture does not use the same deep pipes used on P4, it is not as dependent on memory bandwidth for performance. It is ironic that the new 939 now has the highest memory bandwidth we have measured on any platform, but doesn't really need the added bandwidth for best performance. Consider it tremendous headroom for future development of the Athlon 64 family.
Pumpkinierre - Wednesday, June 16, 2004
The 8% was referring to the unbuffered Sandra memory bandwidth increase not the actual bus speed increase which is 20%. Again the difference is probably due to the different memory latency timings. I wish Wesley would run the test using the same timings.
If you increase bus speed, it should decrease latency if all settings stay the same. So it ought to improve performance similarly to lowering memory latency settings. I expected this to be the case for the a64 where the system RAM is the bottleneck. But this doesnt seem to be happening. I've seen only moderate increases (~3%) in the benchmarks with large FSB increases on the A64. It might be that the on-die memory controller is tuned to a particular speed and any bus speed change throws it out of sync. (or it could be the benchmarks-another story!) It tends to poison the purpose of unlocking the multiplier below stock.
Dont know about the P4, you'd have to get a multiplier unlocked one to test it out (which are pretty rare).
TrogdorJW - Wednesday, June 16, 2004
Ummm... 8% is a little low, Pumpkin. 240 is 20% faster than 200. Still, it's rather surprising that Athlon 64 doesn't seem to care all that much about the added memory bandwidth. Then again, this has been shown to be the case in numerous instances: doubling the L2 cache from the 3000+ to the 3200+ doesn't produce a huge increase in speed, and dual-channel 939 boards are also not much faster than the single-channel 754 parts.
It would be interesting to see some results for P4 in the same type of test. I'm sure they're out there, but I haven't looked for them lately. A 2.8C overclocked to a 243 MHz bus would be the equivalent of a 3.4C in clockspeed, with the only difference being the increase in memory bandwidth and FSB speed. Unlike the A64, I'm pretty sure that an overclocked 2.8C would beat the 3.4C in most benchmarks.
Different strokes for different folks, I s'pose.
Pumpkinierre - Tuesday, June 15, 2004
Errata should be "despite an ~8% increase in mem. bandwidth " in that last paragraph.