PS3 (Research/Experimental) - NEC/TOKIN Capacitors Replacement - YLOD

Yes, that's the post from Workz.

I just use a plain old probe, but I pop the grabber off the end to expose the needle probe underneath. I clip ground to the copper ground plane that runs all around the edge of the board, then jam the probe into the vias closest to the chipset after the TOKIN and hold it there. We're not looking for any kind of perfect measurement here; you're checking that the signal didn't clearly and fundamentally change over to either that weird sawtooth or a clear sine wave.

If I remember correctly from when I was dicking around with showing how many TOKINs you could remove entirely before it failed, I got almost 500 mV peak-to-peak of noise with half of them removed and still had a backwards compatible model running GT6 just fine. The vast majority of the noise was much smaller than that, but the instantaneous spikes were quite large.
 
Then much of the noise is parasitic inductance and crosstalk from that nice long antenna of a ground wire. It's hard to see the characteristic sawtooth waveform in your pics for the CPU with good TOKINs because of that noise. With the bad TOKINs it's easy, because they aren't doing their job and the noise is small compared to the VRM waveform. It would look a ton better with a ground spring and probing near the TOKINs. That will still be noisier than attaching to the vias on the underside of the board, but it is easier to accomplish and probably fine for qualitative testing. It would be nice to be able to see the voltage drop, the capacitor kick in, then the regulator restore voltage to the appropriate level, all in one transient event. That would need less noise and zooming in on the horizontal axis.

For example:
o95OQEQ.jpg

FZhaEcN.jpg

In this example the capacitor acts like a battery, kicking in to limit the voltage drop and supply current long enough for the regulator to switch on and restore voltage to the specification. You can see a bit of overshoot/ringing. The harmonic noise from these transient pulses/ringing events passes right through the capacitor to ground because of the capacitor's other advantage: its very low impedance at high frequencies (kinda like a pressure regulator). So not only does the capacitor reduce the voltage drop, it also reduces the overshoot and ringing and decouples the harmonic noise. Thanks to a bit of tuning in the second stage filter, this circuit is very efficient. Usually these circuits are carefully laid out, simulated and then iteratively tuned with real world measurements. It's a process. But that "tuning" requires that we respect the ESR and capacitance specifications SONY's engineers chose.

To evaluate if the capacitors we choose are appropriate, it would be nice to have the above graphs prove it.
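
To put a rough number on the "very low impedance at high frequencies" point, here is a minimal sketch. It assumes a simple series R-C model (ESR in series with an ideal capacitor) and deliberately ignores ESL, which in a real part makes the impedance rise again at high frequency. The bank values are the 5280uF / 1.5625mOhm figures quoted elsewhere in this thread; the function name is made up for illustration.

```python
import math

def cap_impedance_ohms(freq_hz, capacitance_f, esr_ohms):
    """|Z| of ESR in series with an ideal capacitor (ESL ignored, an assumption)."""
    reactance = 1.0 / (2.0 * math.pi * freq_hz * capacitance_f)
    return math.hypot(esr_ohms, reactance)

# Bank values taken from the replacement set quoted in this thread.
BANK_C = 5280e-6      # farads
BANK_ESR = 1.5625e-3  # ohms

for f in (1e2, 1e3, 1e4, 1e5, 1e6):
    z = cap_impedance_ohms(f, BANK_C, BANK_ESR)
    print(f"{f:>9.0f} Hz  |Z| = {z * 1e3:9.3f} mOhm")
```

In this model the impedance falls with frequency until it flattens out at the ESR, which is why the ESR figure matters so much for the fast edges.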
 
It's been a while since I grabbed all those images, but I remember soldering some, and I remember grounding as close as possible at first, but it didn't make any noticeable difference to the images, so then I just did whatever was easiest. Regardless, the pin and edge ground plane is more than sufficient for diagnosis from either side. I have two different testing jigs that hold the board different ways, so I've poked all over at this point.

I've got a board in the drying oven now waiting for a GPU. I might get to it tonight, so I'll dig around and see if I have an extra probe I'm okay with sacrificing to make a nice cable following that NXP document and see if it makes any real world difference.
 
...The failed capacitors can only be identified by oscilloscope under load. All the ones I found read perfectly within specs for capacitance and ESR when out of circuit.

Heating the caps to get your system to work doesn't prove anything. Not only is there ZERO documentation or evidence anywhere that heating a POLYMER cap causes it to regain capacitance, missing capacitance isn't the problem in the first place. You're also flexing the board by microscopic amounts no matter how isolated you think you made the area you heated. It will just as easily give you a false positive from a mechanical re-connection of a cracked joint. I even heated one of the failed sets myself WHILE CONNECTED TO AN OSCILLOSCOPE and observed no change whatsoever in the signal. Please stop spreading misinformation that this test yields any meaningful results.

If you're bothering to do the repair, please replace all of the TOKINs at once. Do it right the first time. While the exact mechanism of failure still isn't clear to me, the failed capacitors need to be removed. I even verified this by moving a bad cap to a console that was running fine with a missing cap, and this caused it to YLOD....

...I have no idea what page it's on, but I've shown, with 'scope images, that you should be going as close to the original specs as possible. Obviously, less capacitance made the noise worse, but adding more than the original amount also made it worse....

Finally, I'm still sitting at something like 3 bad sets of TOKIN confirmed in about 100 dead consoles. Sorry, it's all BGA defects here. I haven't found another bad set of caps in 6 months or more...
From page 82, I'm just going to re-post here for quick reference:
CPU (NEC/TOKINs)
zzNXQVs.jpg


CPU (5280uF, 1.5625mOhms ESR)
0ZZ5EAy.jpg


RSX (NEC/TOKINs)
zzNXQVs.jpg


RSX (5280uF, 1.5625mOhms ESR)
ioVngCI.jpg


RSX (5280uF 68.75mOhms ESR)
EtKC9SC.jpg
Those are the results I wanted to see, but the graphs are too far zoomed out to see what's really happening for each transient event. We only ever got GPU measurements with the 1.1mOhm low ESR caps (where the hell did you find any that low BTW?) You confirmed both significantly lower and higher capacitance than 4800uF is bad. Good job! Matching specification (close enough anyway) was the sweet spot and your measurements proved it. Very nice!

About heating the TOKINs: heat doesn't restore capacitance, it lowers ESR. That's a known thing. You wouldn't be able to measure ESR without removing them, so heating them while scoping them in situ wouldn't reveal anything out of the ordinary. EDIT: If the ESR rises you might see the voltage drop more, provided the minuscule change isn't hidden by parasitic noise from your probing method. And we'd need to be able to see a single transient event to compare that initial voltage drop. The heat required to remove them would... you guessed it, temporarily lower the ESR. Possibly this explains why they test fine out of circuit. When you solder one back into a PS3 that was working fine with one removed, it now sees a real load coming, is unable to hold its ESR, craps the bed and spoils the party (metaphorically speaking). In other words, the transient voltage dropout falls too low and triggers a YLOD. On your time scale it would look like a larger than normal downward transient spike. If you look at the RSX graph for the low ESR caps compared to the higher ESR caps, the downward spikes don't drop as far, but I can't tell if those spikes are just reflections from parasitic inductance, given your probing technique. They're probably meaningless and I'm just looking for confirmation (confirmation bias). However, your RSX NEC/TOKIN Vmin of 1.07 V (-130 mV) is significantly below the 1.2 V RSX spec, and both of your tantalum replacements were at 1.15 V (-50 mV). That might be significant.
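
For anyone who wants to redo the droop arithmetic on their own captures, a minimal sketch follows. The 1.2 V nominal and the two Vmin readings are the figures from this post; the helper name is hypothetical, and the percentage is just droop relative to nominal, not any official tolerance.

```python
def droop_mv(nominal_v, vmin_v):
    """Droop below nominal in millivolts, rounded to the nearest mV."""
    return round((nominal_v - vmin_v) * 1000)

RSX_NOMINAL_V = 1.2  # RSX spec voltage as quoted in the post

# Vmin readings quoted in the post
readings = {"NEC/TOKIN": 1.07, "tantalum replacements": 1.15}

for name, vmin in readings.items():
    d = droop_mv(RSX_NOMINAL_V, vmin)
    pct = 100.0 * d / (RSX_NOMINAL_V * 1000)
    print(f"{name}: Vmin {vmin:.2f} V, droop {d} mV ({pct:.1f}% of nominal)")
```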

So what you're saying is that the NEC/TOKINs are fine, based on your measurements of YLOD consoles, and that something else is the problem (for example the BGA or the bumps under the die)? And that removing the NEC/TOKINs does nothing unless we actually had a bad TOKIN to begin with (which is unlikely in your experience)? In that case we won't have repaired anything, just warped the board or applied pressure from a slightly different angle to temporarily make electrical contact. Basically, if we look at it cross-eyed it'll YLOD again.
 
Alrighty, the car is detailed and the garden is tended. I can dig in at the bench with a few beers and some cartoons for the rest of the night.

I just checked my order history on Mouser, and apparently I got the 1.1 mOhm caps from Typo-land. They're 1.1 Ohm.... and still worked. https://www.mouser.com/ProductDetail/74-TMCMB0E337MTRF

I can't find anything that says ESR is lowered by heat. The only thing I can find is this Q&A from TDK, in reference to their MLCC capacitors, that says heat does not affect ESR: https://product.tdk.com/info/en/contact/faq/faq_detail_D/1432616873696.html

Regardless, any change in any operating parameter is going to show up in use on the 'scope, and they never changed. I attribute this entirely to actually diagnosing them as the failed part first, where everyone else is guessing blindly.

Yes, I've found it extremely rare. For about 6 months, I changed the caps on every console that wasn't revived from reballing the GPU and whatever other various fixes it needed, and it never brought one back to life. Since using the scope to diagnose, I've found 3 total systems that were bad (2 of them were both the GPU and CPU at once if I remember right - another reason to just do it all at once). I stopped keeping track officially, but it's roughly 3 in 100 for me right now.

I honestly can't remember the last time I lifted the GPU on a backwards compatible model and didn't see evidence of a BGA defect. I've become convinced that even perfectly working 90nm consoles already have some cracks, they're just still making good mechanical contact. I mentioned a hundred pages ago that I've seen PS4 boards where just moving the board already in the clamp to a different spot on my desk would cause it to switch between working and failing before reballing it back to life.

I'll report back in about 2 beers. Even if this fatty doesn't come back to life, I can still grab some images from it.
 
Just did a little digging and I think you're right about the ESR thing. The only thing I can find is that ESR decreases with temperature for aluminum electrolytics, which is likely due to the liquid electrolyte. ESR in those tends to increase as the electrolyte dries up, and that's probably where my mind was pulling the "ESR decreases with heat" thing from. But it was word vomit, as it shouldn't apply to any other type of capacitor. So scratch that.

Holy crap, 1.1 Ohm / 16 = 68.75 mOhms! And the damn thing booted? Well, not too long ago I was trying to figure out the Z_target for this circuit, and all I could find was a general line drawn on a graph during a KEMET webinar at 100 mOhms, so at least it's under that...lol! Actually, that makes me wonder if it would boot if you used 11 of them (100 mOhms). My new Z_target is based on that NXP paper and your scope readings, but a target isn't an upper limit. Who knows how high it can go. It couldn't have been good for heat performance though. The thing probably would have gone exothermic on you had you left them on too long, with the tantalum metal reacting with the MnO2 to explode once the heat got high enough.
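
The parallel-bank arithmetic above is easy to sanity check. This is a minimal sketch assuming identical caps in parallel, so capacitance adds and ESR divides by the count. The 330 uF per-cap value is inferred from 5280 uF / 16 and is an assumption, as is the helper name.

```python
def parallel_bank(n_caps, cap_each_f, esr_each_ohms):
    """Total capacitance and ESR of n identical capacitors in parallel."""
    return n_caps * cap_each_f, esr_each_ohms / n_caps

CAP_EACH_F = 330e-6   # assumed per-cap value (5280 uF / 16)
ESR_EACH_OHMS = 1.1   # per-cap ESR of the "typo" parts from the post above

for n in (16, 11):
    c_total, esr_total = parallel_bank(n, CAP_EACH_F, ESR_EACH_OHMS)
    print(f"{n} caps: {c_total * 1e6:.0f} uF, ESR {esr_total * 1e3:.2f} mOhm")
```

Note that dropping to 11 caps also cuts the total capacitance to 3630 uF, so that experiment would change two variables at once.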
 
Yes, that one booted and idled at the main menu just fine. I didn't try to run any games on any of them though.

This CECHG01 is back to life after a reball. I swear, I don't buy anything but backwards compatible models, but every time I'm screwing around with experiments here, I have a non-BC laying around.

But... I can't make the cable. The only BNC connectors I have are on my scope probes, and they are apparently not made the way I thought they were. I snipped off the probe end, and it still has 150 ohms of resistance just in the connector and shielded cable. Not only that, but the central wire is apparently made of some magical alloy that won't let me solder to it. Sanded, dipped in flux, a million degrees, no dice.

I did try probing another half dozen spots, grounds as close as possible, 10x and 1x, and no change. I can't get much more resolution, either. My scope only goes down to 10 mV. I can grab a plain BNC and some regular shielded cable for a few bucks, but I suspect it's still going to look the same. I think those images are wishful thinking on a perfect simulated circuit and not a real world device with a million other parts mucking things up in various ways.

I've got a CECHA01 waiting, so we'll see next week.
 
Try this on the 2 ends of an MLCC under the processors (one of the 0.1uF bypass caps). If it doesn't improve the noise, it doesn't improve the noise.
iu
 
...Oh man, now we're getting to some good, techy stuff. I told myself I'd take the night off from any more repair attempts, but couldn't resist finding out if replacing all the NECs would finally allow boot or not... I removed two of my caps from each group I had installed yesterday. It went fine. I then finally removed all the remaining NECs from the other (chip) side of the board. It went fine. Exhausted, I told myself I'd stop (been messing around with this after full days of work and not much sleep), but I didn't.

I had some thin pieces of FR-4 sputtered with copper and cut them to about the size of the NECs, removed the copper in a thin line down the middle to separate a hot and a ground side, installed five of the tantalums over the copper-free troughs, checked capacitance, and then installed all four on the board with two leads to hot on one side and two leads to ground on the other. I don't have my pics or notes at the moment, but I think each little board was measuring about 1.4 to 1.6 milliOhms. BUT, one board was measuring ~4.6, so I removed all the caps and tested each, and some were really, really high compared to what they should've been?!? So I used more brand new ones and finished them up. I put one of these in place of each TOKIN, and had 8 individual caps still in place on the other side of the board per chip, for a total of 18 per chip. I cleaned it up, sat the board on top of the fan in the lower case, sat the metal shield on top, connected just the PSU and plugged it in. Turned it on, same ~3 sec of green until YLOD.

Tried it one more time, and just as I hit the power button there was this LOUD beep, beep, beep, beep beep, beep, beep, beep (also YLOD) and I thought "what the hell?? nobody mentioned this audible alarm before! It's gonna effing blow or something!!" I yanked the cord out, picked it up, still beep,beep,beep,beep beep,beep,beep,beep (groups of four loud beeps with a pause in between). "Oh$hit, this thing's NOT happy", ran over to the table with it, set it down, unplugged it and ripped the PSU off. STILL the beeps! I held the PSU up to my ear, didn't sound like it; picked up the rest of the unit, didn't sound like it was coming from it either. I was running around the room looking at the ceiling for a fire alarm or CO2 sensor or something that was going off, and everywhere I went it was still as loud!! By this time I was down the hall and in another room and about to go mental, then I reached down into my jacket pocket and found the small pocket radio I'd forgotten I had in there from using it earlier on the bench. It's a little AM/FM radio with a speaker in it for listening without earbuds that I just recently got and started using... It was the damn alarm I didn't know was set on the radio!!!! OK, packed up my crap and went home. Lesson learned: know when to call it quits from fatigue... Hard to tell what I might've missed or messed up with my repair, probably nothing, but it'll have to wait for another day.
 
Alrighty, that seems to have made a pretty big difference, but it doesn't really look ....helpful.... to me. Seems like it just cut out the big spikes, but the overall idea is the same. Here's a perfectly working (YLOD, checked caps my lazy way prior to reball, reballed, working perfectly now, checked caps again after reball and they were identical to prior to reball) CECHG01 GPU, and I had to go all the way to the limit of 10mV. Gimme a few minutes and I'll toss the CPU up. I have to make separate posts or I'm sure I'll mix them up.

sDnUHlb.jpg
 
CPU side doesn't have any spots that are apparent to me without diving in to a schematic at 1:30 in the morning with a few beers in me, so it's bed time. I pulled out a scrap CECHA01, and those are really obvious. If you want the CPU side, you're gonna have to find a schematic of a CECHG01 and send me a picture of where you want it, cause they're all unpopulated (and gave me the same image as my lazy way or no reading at all) or won't tone out to where they need to be. Either that or wait until next week for the next CECHA01 and I'll make the proper cable by then.
 
@squeept Thanks for sharing your experiences and running these tests!

I was wondering what are some of your steps for determining the GPU is at fault, before moving to reballing the chip?

Regarding the reball of the GPU, what kind of equipment are you using? Also what temp profiles are you targeting to do your reballs?

Apologies for all the questions. I have a few spare boards I'd like to attempt this with, but don't want to do it without better diagnostics of the actual problem and more information on the process specifics. I do have a jig, stencils, access to rework equipment, etc.
 
...CECHG01 GPU...
sDnUHlb.jpg
That looks much better. The probing technique has improved your Vpp to 20mV, which seems more reasonable than the 170mV you were getting before. All the other automatic measurements have tightened up too. Also, your vertical axis is zoomed in more than it was before, so the earlier spikes would fill the screen at this resolution. This would definitely reveal more nuggets if they're there to be found. It's possible that any difference between the NEC/TOKINs and the tantalums was obscured in the noise before. So this is good (not for the myth/fix). It's clear enough to make out the sawtooth pattern now, at least. It was mostly buried in the noise before.

Are you fully zoomed in on the time scale (H = 1us)? It's still a bit squished together to make out what's happening on a single fall/rise event. This is to be expected for a power delivery system to a processor. They're difficult to scope (you need a good scope and probing technique). It won't get better than this without a direct soldered connection to a +/GND BGA via. I wonder if that 50 Ohm resistor they use increases the vertical amplitude? That might make it easier for your scope. IDK.

CECHG models started using the 65nm CELL BE, so its readings wouldn't be applicable to the earlier 90nm CELL BEs. Also, thanks to the smaller process, the CPU on G and later models should run cooler (theoretically), meaning RSX failures would be more common in these models, whereas the 90nm CELL BE runs hotter than the RSX, at least on A models. I'd wager you probably find more CPU failures on the A, B, and E models? What's the ratio of RSX to CPU failures in the early models? I'll bet it's closer to 50/50 than in the later models. I would imagine you then reball both on early models, just to cut down on unsatisfied customers. I'm more interested in the A01, since that's what I have and the only reason I'm interested in owning a PS3 anyway (PS2 hardware BC). So I'll eagerly await those measurements.
 
I'll hold off on any more poking around until I have another BC and a nicely made cable ready. I'll log back in and remember to tag you when I do.

========

I do things a little differently. Since these aren't my personal consoles, I don't have any emotional attachment to them. I need to make money, so I force myself to stop looking once the odds of a dead GPU have gone through the roof. And, since I honestly believe every 90nm console has BGA defects on the GPU and I provide a 6 month warranty that I don't want people to have to use, I reball every one of them even if they work. They sell for more, which makes the time worth it. So, I have the "luxury" of using reballing as part of the diagnosis.

My current workflow is something like this:

-> Clean the board
-> Check every inch of both sides under the microscope for burned components, water damage, physical damage, and missing components from the last idiot that opened it up
-> Put it in a jig with known working components
-> Pressure test all the major chips
-> Save a base image of the TOKIN output so that I know if I hurt their health during reball
-> Drying oven then lift GPU
-> Ohm test GPU then reball
-> System is now working :)

If I actually have to keep going from here, I'll usually just poke around for a half hour or so looking for shorts and following voltages around, and a quick check with a thermal camera. Once you know your TOKINs are good and a chip is reballed, the heat test will actually provide useful information - if it works, then the chip / bumps are definitely toast. Unless something gives me a hunch, the odds of a dead GPU are just too high now, so I'll stop after that.

The CPU is rarely an issue. I don't pre-emptively reball it, and there are usually no signs of BGA defects when I do take the time to lift them off. I'd guess 1 in 20 consoles that come through here? I only reball them if they respond to the pressure test, I find torn pads on the GPU, or there are other signs that the console was dropped. My guess is that the CPU holds a more stable temperature. I believe the GPU has more fluctuations both in overall temperature, and in tiny localized sections, so it essentially has tons of mini heat cycles thus it experiences a lot more thermal stress. This is all conjecture, though.

I have an ACHI IR-PRO-SC because it's cheap but good quality, made almost entirely of "off the shelf" components, and at the time I bought it there was a huge community of helpful people that had one on the BGAmods forum. Every machine is going to behave differently, even between the same models, so you need to work out your own profile following a few basic guidelines. With my machine, it goes bottom pre-heat to a stable 167C measured on top. Top heat until 227C. Don't exceed 1 degree increase per second. Cut heat, cool gently by pulsing the fans every few seconds until around 180C then fans full blast until room temperature.
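
As a rough aid for working out your own profile, here is a minimal sketch of the two checks implied above: the shortest the top-heat phase can be under the 1 degree per second limit, and whether a logged run broke that limit. The temperature log is made-up example data and the function names are hypothetical; nothing here is specific to the ACHI machine.

```python
def min_ramp_time_s(start_c, peak_c, limit_c_per_s=1.0):
    """Shortest allowed time to go from start_c to peak_c at the ramp limit."""
    return (peak_c - start_c) / limit_c_per_s

def max_ramp_c_per_s(samples):
    """Steepest rise between consecutive (time_s, temp_c) samples, in time order."""
    return max((t1 - t0) / (s1 - s0)
               for (s0, t0), (s1, t1) in zip(samples, samples[1:]))

# 167C pre-heat plateau to 227C peak, as in the profile above
print(min_ramp_time_s(167, 227), "seconds minimum for the top-heat ramp")

# Made-up thermocouple log: (seconds, degrees C measured on top of the chip)
log = [(0, 167), (10, 175), (20, 186), (30, 195)]
steepest = max_ramp_c_per_s(log)
print(f"steepest ramp: {steepest:.2f} C/s", "(too fast)" if steepest > 1.0 else "(ok)")
```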

Once I finally find another bad TOKIN, I'll learn the SYSCON stuff so hopefully we can verify everything in that thread, and make it the de facto diagnosis method for everyone that doesn't have access to all these tools and machines.
 
Looking forward to that! It would be amazing to validate the error codes.
 
That NXP paper was talking about a shielded BNC male to male coaxial cable that features 50 Ohms and low attenuation. There is no resistor. They just soldered the shielded center pin to a + VIA and then cut the braided copper shielding into a wire shape and soldered it to the adjacent GND VIA:
517iWFhd5sL._AC_SL1100_.jpg

I got tired of not owning an oscilloscope. So I just pulled the trigger on this:
61cGhQ0begL._SL1024_.jpg

Should be a good entry level scope. Many of its features are trial and need to be unlocked (for lots of money), but there are ways around that. Ordered one and that coax cable above. If I can teach the scope what a NEC/TOKIN is supposed to look like then I should be able to make a PASS / FAIL profile that would make diagnosing them easy.
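
A crude offline stand-in for that PASS / FAIL idea could look like the sketch below. It is not the scope's built-in mask test and it does not look at waveform shape (sawtooth vs. sine) at all; it only compares Vpp and droop of exported samples against limits. The sample values and the limits are made up for illustration and would have to come from known-good boards.

```python
def trace_stats(samples_v):
    """Basic min/max/peak-to-peak of a captured voltage trace."""
    vmin, vmax = min(samples_v), max(samples_v)
    return {"vmin": vmin, "vmax": vmax, "vpp": vmax - vmin}

def passes(samples_v, nominal_v, max_vpp_v, max_droop_v):
    """True if the trace stays within the Vpp and droop limits."""
    s = trace_stats(samples_v)
    return s["vpp"] <= max_vpp_v and (nominal_v - s["vmin"]) <= max_droop_v

# Made-up traces and limits, for illustration only
good_trace = [1.19, 1.20, 1.21, 1.20]
bad_trace = [1.07, 1.20, 1.24]

print("good:", passes(good_trace, nominal_v=1.2, max_vpp_v=0.05, max_droop_v=0.06))
print("bad: ", passes(bad_trace, nominal_v=1.2, max_vpp_v=0.05, max_droop_v=0.06))
```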
 
That's the exact model I've got. It's definitely good enough for diagnosis with just the pin probe. I'll try to get my chopped up probe sorted out one more time tonight before I toss it in the trash.

I only upgraded to that one a few years ago. I had been using an old analog Philips with a CRT screen since like 2005. It weighed about 100 pounds, and had a nice big sticker on the back warning about the x-rays it emitted. The phosphor would blind you if you weren't careful with the brightness controls. Quite a change in technology.
 
@squeept Thank you so much for sharing your methods on diagnosis and repair! The one item in there I wasn't sure of though, what is the 'pressure test'? Is that just applying some pressure to the chips to check for faulty contacts?

@RIP-Felix Congrats on pulling the trigger on the scope!! Can't wait to see what you learn from it!
 
Yep, I just have a couple of 1kg powder coated weights that sit nicely on top of heatsinks while it's in a jig. Whatever works, as long as the weight will stay put and won't short things out if it falls on to the board. You can even just press on the chips, but I imagine that would be rather inconsistent.
 
Hi everyone!

First of all, thanks for all the sharing here. This topic allowed me to revive my PS3. The job was not very well done though. My soldering iron is too big for this kind of job and the capacitors are very small, so soldering them was a nightmare. Another nightmare was removing the NEC capacitors.

With everything done and assembled, the PS3 turns on, but it has a problem. First, the CPU temp in the PS3 Control Fan Unit software is around 68ºC and the RSX is around 49ºC. The CPU seems a little hot for idle, but let's ignore that for now. Now the real problem: HDMI output stopped working. I can only output an image over the AV cable. I don't know why or how this happened. What I do know (maybe it is related, or even the cause) is that while I was soldering the caps, some "small" piece (the little circle in the image below, use the zoom to see it) de-soldered and I lost it. I don't know if it is a resistor or a capacitor, or even if I'm able to fix it. My PS3 is an 80GB EUR model, CECHK revision. Hope you can help me! Thanks for your attention and best regards!
DIA-002-1-876-912-22-CECHJ-CECHK-BACK.jpg
 