PS3 Fault finding YLOD with the SYSCON - First steps and Error reporting

[QUOTE]
In any case, now we know to expect a 2120 if the HDMI cable is plugged into the console when there is a BGA defect (3034) - That it doesn't mean there's a problem with the HDMI encoder. And that if there is, it'll show up in a different context (like the cable is unplugged or there are no BGA defects).[/QUOTE]

A 2120 does not need the HDMI cable plugged in. I got a 2120 with no HDMI cable connected.
08/08/2021
errlog
ofst[ 12]:err_code:0xffffffff, clock:0xffffffff
ofst[ 16]:err_code:0xa0202120, clock:0xffffffff
ofst[ 20]:err_code:0xa0202120, clock:0xffffffff
ofst[ 24]:err_code:0xa0061002, clock:0xffffffff
ofst[ 28]:err_code:0xa0202120, clock:0xffffffff
ofst[ 32]:err_code:0xa0202120, clock:0xffffffff
ofst[ 36]:err_code:0xa0202120, clock:0xffffffff
ofst[ 40]:err_code:0xa0202120, clock:0xffffffff
ofst[ 44]:err_code:0xa0202120, clock:0xffffffff
ofst[ 48]:err_code:0xa0202120, clock:0xffffffff
ofst[ 52]:err_code:0xa0202120, clock:0xffffffff
ofst[ 56]:err_code:0xa0202120, clock:0xffffffff
ofst[ 60]:err_code:0xa0202120, clock:0xffffffff
ofst[ 64]:err_code:0xa0202120, clock:0xffffffff
ofst[ 68]:err_code:0xa0061002, clock:0xffffffff
ofst[ 72]:err_code:0xa0202120, clock:0xffffffff
ofst[ 76]:err_code:0xa0202120, clock:0xffffffff
ofst[ 80]:err_code:0xa0202120, clock:0xffffffff
ofst[ 84]:err_code:0xa0202120, clock:0xffffffff
ofst[ 88]:err_code:0xa0202120, clock:0xffffffff
ofst[ 92]:err_code:0xa0202120, clock:0xffffffff
ofst[ 96]:err_code:0xa0202120, clock:0xffffffff
ofst[100]:err_code:0xa0202120, clock:0xffffffff
ofst[104]:err_code:0xa0202120, clock:0xffffffff
ofst[108]:err_code:0xa0202120, clock:0xffffffff
ofst[112]:err_code:0xa0003001, clock:0xffffffff
ofst[116]:err_code:0xa0003001, clock:0xffffffff
ofst[120]:err_code:0xa0003001, clock:0xffffffff
ofst[124]:err_code:0xa0003001, clock:0xffffffff
ofst[ 0]:err_code:0xa0093004, clock:0xffffffff
ofst[ 4]:err_code:0xa0093004, clock:0xffffffff
ofst[ 8]:err_code:0xa0901002, clock:0xffffffff
[mullion]$


With 3034 I got a 2102:
ofst[ 8]:err_code:0xa0403034, clock:0xffffffff
ofst[ 12]:err_code:0xa0232102, clock:0xffffffff
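
For anyone who wants to sort through a long errlog dump like the ones above, here is a minimal Python sketch that splits each `ofst[...]` line into its parts. It assumes the layout seen in this thread: the 8 hex digits start with `a0`, the next two digits are the prefix (what this thread calls the power state, e.g. 80), and the last four are the error code (e.g. 3034). The names `ErrEntry` and `parse_errlog_line` are made up for illustration and are not part of any SYSCON tool.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as: ofst[ 16]:err_code:0xa0202120, clock:0xffffffff
LINE_RE = re.compile(
    r"ofst\[\s*(\d+)\]:err_code:0x([0-9a-fA-F]{8}),\s*clock:0x([0-9a-fA-F]{8})"
)


class ErrEntry(NamedTuple):
    offset: int   # value inside ofst[...]
    prefix: str   # two hex digits after the leading "a0" (assumed layout)
    code: str     # last four hex digits, e.g. "2120"
    clock: int    # raw clock value, 0xffffffff when no time was recorded


def parse_errlog_line(line: str) -> Optional[ErrEntry]:
    """Return the parsed entry, or None for non-log lines and empty slots."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    offset, err, clock = m.groups()
    err = err.lower()
    if err == "ffffffff":
        return None  # unused slot in the log
    return ErrEntry(int(offset), err[2:4], err[4:8], int(clock, 16))


if __name__ == "__main__":
    sample = [
        "ofst[ 12]:err_code:0xffffffff, clock:0xffffffff",
        "ofst[ 16]:err_code:0xa0202120, clock:0xffffffff",
        "ofst[  8]:err_code:0xa0403034, clock:0xffffffff",
    ]
    for text in sample:
        print(parse_errlog_line(text))
```

Pasting a whole log into a list and filtering out the `None` results gives a clean list of (prefix, code) pairs to compare between boots.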
 
Without some more digging we can't be sure.
  1. What model?
  2. How long is the YLOD?
  3. Did you inspect the board for physical damage?
  4. Did you find any shorting caps, open fuses, etc?
  5. Did you confirm proper voltages on the CPU/GPU?
I'm not sure about a 3031, but a 3034 is usually an RSX<-->Cell RX/TX line. There is a break in the data line that connects the two processors. It could be the BGA on either chip (usually the RSX). And it could be the bumps on the die itself or the RSX RAM (although I suspect we'd get a different error if it were the RAM).

But yours is a 3031, which could be a bit different. If your model of PS3 has a mullion SYSCON you'll need to gain internal access, but I would like to see the bringup, errlog and lasterrlog commands. If there is a "BitTraining Error" it could tell us where the problem is. If your console has a Sherwood SYSCON, log in using "auth" (lowercase) and use the bringup and errlog commands.

In my post about 3034 (https://www.psx-place.com/threads/f...-and-error-reporting.30100/page-4#post-249427) I pressed on the lines between the CELL and the RSX.

3004 is a NEC/Tokin error. When there is too little capacitance and the noise is too high, it triggers a 3004 power failure. Try double checking for cold solder joints and be sure the solder flowed correctly. That ground plane is a real heat sink, so if your iron can't power through it and give you nice shiny round solder blobs, you may need to add hot air in the area to help.

Just keep in mind that the heat can hide BGA faults until the strain relaxes. So expect false positives.

I believe pin 7 (POWERGOOD) of IC6201 (RSX) and IC6103 (CELL) is an important clue. If you have a 3003 (CELL) or 3004 (RSX), check this pin: it is probably at 0 V, whereas in normal conditions it should be at 3.3 V. I will retest by injecting 3.3 V when it reads 0 V to make sure the 3003/3004 error goes away. I ran this test over a year ago, but I don't remember the result.

I used 32 of the Panasonic EEF-GX0E471L. I ordered them from here.

https://www.tti.com/content/ttiinc/...er=EEF-GX0E471L&autoRedirect=true&minQty=3500

Per some reddit comments I saw that this had a middle negative terminal which was appealing.

https://www.reddit.com/r/consolerep...these_the_right_capacitors_for_phat_ps3_ylod/


Between that and availability that is why I chose them.

No 1002s.



The 3001s are from me not having the PSU on the 12 volt prongs for testing.

3001 - No 12v.
 
Reposting this post since no one responded to it. Thanks

Hi,
I did the syscon error log last night and got these errors. I have a COK-001 fat PS3. It's giving me the RLOD (not YLOD). I'm assuming a reball is in order due to 3034 error code. Could someone confirm this is correct? Could it be anything else? Thank you.

===================================
ERR 00: 00000000 A0403034 289FB4DA
ERR 01: 00000000 A0403034 289FB4AF
ERR 02: 00000000 A0403034 289FB4A5
ERR 03: 00000000 A0403034 288F71E3
ERR 04: 00000000 A0403034 288F7035
ERR 05: 00000000 A0403034 288F7014
ERR 06: 00000000 A0403034 1C376D10
ERR 07: 00000000 A0403034 19013BA9
ERR 08: 00000000 A0902120 18EABAC8
ERR 09: 00000000 A0403034 18EABAC8
ERR 10: 00000000 A0801601 18E592E8
ERR 11: 00000000 A0801701 18E592E8
ERR 12: 00000000 A0801001 18E2D3F7
ERR 13: 00000000 A0801001 18DECEB7
ERR 14: 00000000 A0801001 18D2F46C
ERR 15: 00000000 A0801001 1887E7A9
ERR 16: 00000000 A0801001 185A3B1F
ERR 17: 00000000 A0801001 176BFCC3
ERR 18: 00000000 A0801001 175A9ADF
ERR 19: 00000000 A0801001 16FF1A1C
===================================
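
A log in this `ERR nn:` layout is easier to read once the repeats are counted. Below is a minimal Python sketch that tallies (prefix, code) pairs. It assumes the middle column is the error word, with the same `A0` + prefix + code layout as the other logs in this thread; the first and last columns are ignored here. `tally_codes` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as: ERR 08: 00000000 A0902120 18EABAC8
ERR_RE = re.compile(
    r"ERR\s+(\d+):\s+([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})"
)


def tally_codes(log_text: str) -> Counter:
    """Count how often each (prefix, code) pair appears in the log text."""
    counts = Counter()
    for m in ERR_RE.finditer(log_text):
        err = m.group(3).upper()          # middle column, e.g. "A0403034"
        counts[(err[2:4], err[4:8])] += 1
    return counts


if __name__ == "__main__":
    sample = """
    ERR 07: 00000000 A0403034 19013BA9
    ERR 08: 00000000 A0902120 18EABAC8
    ERR 09: 00000000 A0403034 18EABAC8
    ERR 10: 00000000 A0801601 18E592E8
    """
    for (prefix, code), n in tally_codes(sample).most_common():
        print(f"prefix {prefix} code {code}: {n}x")
```

Run against the full log above, this makes it obvious at a glance that the 3034 entries dominate the recent history.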
 
3034 is associated with RSX issues, yes. If the issue is a BGA defect, a reball/reflow is necessary (preferably reball).

However, association isn't causation! While the RSX is the usual culprit, the BGA defect could also occur on the CPU. So we can't be 100% sure. It is more diagnostic than any other method so far. If a reball fails and a CPU reflow fails, there could be a break on the FlexIO traces, a defect on the solder bumps that connect the die, or internal shorts due to electromigration. Once the RSX is off the board (necessary for a reball) you can ohm test the chip to see if it's okay. If it's bad, the RSX can be replaced. CPU cannot (not easily).
 
OK, I had an electronics board company remove the RSX, and they said it's reading 1000 ohms without the chip. They are waiting for my confirmation to solder the new RSX chip on, as it will be an additional cost. Please advise whether we think the board is good and warrants a new RSX chip - all and @RIP-Felix @Pacorretaco
Good. If the short is no longer on the board, you can check the RSX itself to see the 0 ohms.

That's when you can make a nice RSX keyring.
 

Attachment: rsx_messpunkte_spannungsversorgung(1).png
Hey there, this specific system went from YLOD to GLOD after reflowing the GPU. At the time I concluded that the GPU needs replacing. However, I can't help but wonder if I may have fallen short by not cleaning the flux residue from underneath the chip very well, which I assume could have messed with a signal or something. You can only go so far with isopropyl, a toothbrush and an air compressor... Not that these cleaning tools won't work, they have worked for me... I just can't help but wonder if they were insufficient when things don't end up working.

Anyways, last month I got myself a gigantic ultrasonic cleaner with some distilled water and Branson EC solution to pretty much rule out any human error in the cleaning process. I have yet to use it on a PS3 that's been reflowed, but on all the other boards so far it has been a huge time saver, and the boards look MINT afterwards. I wish I had got myself an ultrasonic cleaner the day I picked up soldering!

Also, I have still been keeping records of every system I've been working on; I haven't forgotten about that. I'd like to share all of that information in one go later on. I'm just curious to see how much better my success rates can get after putting my boards through the cleaner. I'm also looking to try my hand at reballing again when my non-POS reballing jig arrives, but it's coming from China so who knows how long that'll take.


Hi again!! Back with some news. I went ahead and tried the reflow method. After reassembling the console and trying it for the first time, it came back to life but froze on the PS3 boot screen. I tried a couple more times and I am getting either no display or 3 beeps.

I searched the SYSCON error log again and found these 2 new codes:
ofst[ 4]:err_code:0xa0801701, clock:0x0b4886da 2005/12/31 00:01:30
ofst[ 8]:err_code:0xa0801601, clock:0x0b4886da 2005/12/31 00:01:30

Do you have an idea of what is going on here?
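
Side note on the clock column: in the two lines above, clock 0x0b4886da is printed next to 2005/12/31 00:01:30, which works out exactly if the value is a count of seconds since 2000-01-01 00:00:00. Here is a minimal Python sketch under that assumption. The epoch is inferred from this one sample only, it is not documented anywhere in this thread, and `decode_clock` is a hypothetical helper name.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: the clock counts seconds from 2000-01-01 00:00:00.
# Inferred from the single log line in this post, not from documentation.
ASSUMED_EPOCH = datetime(2000, 1, 1)


def decode_clock(clock_hex: str) -> Optional[datetime]:
    """Convert a clock field such as '0x0b4886da' into a datetime.

    Returns None for 0xffffffff, which shows up in the logs when no
    time was recorded for the entry.
    """
    value = int(clock_hex, 16)
    if value == 0xFFFFFFFF:
        return None
    return ASSUMED_EPOCH + timedelta(seconds=value)


if __name__ == "__main__":
    print(decode_clock("0x0b4886da"))
    print(decode_clock("0xffffffff"))
```

If the assumption holds, the same function can be used to put the entries of other logs in date order, which helps when judging which errors came first.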
 
Hi!
I would like to ask if the VER-00X boards have a serial connection, specifically the VER-001. The fat PS3 that I have been working on still shows a YLOD even after replacing 2/4 of the NEC caps. Also, I've discovered a SYSCON chip on the board; for this board it was SW-301 (I'm not sure if it varies). Is this the main SYSCON chip itself, and if so, does it have pins that provide a serial connection?
 
We've been seeing those codes a lot recently. They're not very specific, meaning they can occur with various issues. I've seen them pop up after reflows before (if I remember correctly). Given they appeared after you reflowed, my guess is your reflow attempts failed to wet all the joints (actually bond to the pad).

Alternatively, the heat from the reflow, depending on how you performed it, can place strain on other components, like the CPU's BGA or the RAM. The bumps or RAM could be shot. Unfortunately, those errors don't give us enough clues to go off of. What I do know is that they are occurring in power state 80, which means the system made it past POST. The power-up sequence and BitTraining occur during POST. The BGA pads that usually crack first and cause a YLOD are the voltage and FlexIO data lines. So your reflow reconnected those, and the system is now erroring in the OS.

A YLOD can change to a GLOD for various reasons. A problem with any of the following will cause a GLOD:
1. CPU RAM Issues
2. RSX RAM Issues
3. AV / HDMI Issues
4. OS Software Issues
5. HDD Issues

Did you thoroughly clean the BGA with electronics contact cleaner (before and after the reflow)? What flux did you use? What equipment did you use (heat gun? Bottom heater? What settings?). What temperatures specifically?! Did you dry the board and if so how long at what temps? Did you nudge the chip to confirm it went molten? These details are important. If you performed the usual "heat gun special," then there are MANY ways it could have failed. Youtube videos like that Ifixit one are terrible advice.

Like @Sampsonay said, cleaning before/after a reflow is very important to achieving a good result. And if you didn't thoroughly dry the board, then moisture can popcorn ICs and squeeze the solder out from underneath the RAM's underfill. That could certainly cause a GLOD! Even if you did give the reflow the best chance of success, reflows can't remove oxidation off BGA pads or replace the old lead-free solder like reballs can. So they'll always be inferior.

It's possible a software issue/OS corruption can cause a Livelock situation. Have you tried rebuilding the database? Can you even access the safe boot screen?

1601 is a "Livelock" situation. It's "where a request for an exclusive lock is denied repeatedly, as many overlapping shared locks keep on interfering with each other. The processes keep on changing their status, which further prevents them from completing the task." Imagine 2 people in a rush who have to pass each other on the street. They both need to run to make an important appointment. They try to run past each other, but both of them swerve in the same direction, blocking one another, and have to stop. Then they try again, choose the same direction again, and have to stop again. Over and over. Neither can get anywhere until they agree on a direction, and by the time they do, they've both missed their appointments. That's a crude analogy, but basically the process that needs to complete can't. Hardware and software issues can cause this situation.
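
The street analogy can be written down as a toy simulation. This is only an illustration of what a livelock is in general, written in Python; it says nothing about how the PS3 actually detects or raises a 1601, and the function `corridor` and its two-lane model are invented for the example.

```python
def corridor(rounds: int, tie_break: bool) -> bool:
    """Return True if two walkers facing each other manage to pass.

    Both start in lane 0. Whenever they are in the same lane they are
    blocked, and each one reacts by stepping sideways.
    """
    a_lane, b_lane = 0, 0
    for _ in range(rounds):
        if a_lane != b_lane:
            return True              # different lanes: they can pass
        a_lane = 1 - a_lane          # walker A always sidesteps
        if not tie_break:
            b_lane = 1 - b_lane      # B sidesteps too, so they stay blocked
        # With tie_break, B holds its lane and lets A move out of the way.
    return a_lane != b_lane


if __name__ == "__main__":
    # Both keep changing state, yet neither ever makes progress: livelock.
    print("no tie-break:", corridor(1000, tie_break=False))
    # Once one side yields, the conflict resolves almost immediately.
    print("tie-break:   ", corridor(1000, tie_break=True))
```

The point of the first case is that both walkers stay busy the whole time, which is what separates a livelock from a plain hang where nothing moves at all.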
 
Yes, I made a tutorial here that shows where to connect to the SYSCON for every model of PS3. Just follow the tutorial and you will have the errorlog in no time!
 

Answering some of the questions: I did use flux before and after the reflow. I used a heat gun for approximately 3 minutes until I reached 235 Celsius. I removed the IHS from both the CPU and RSX before using the heat gun, and I let the board dry for 2 complete days. I did not nudge the chip.

About the safe boot screen I have not tried it yet, thanks for the advice, will try later.

Will keep you updated.
 
Thank you. You are a lifesaver!

Also, I would like to ask whether chipping off an SMD cap near one of the NEC caps could trigger the YLOD error. While performing the replacement, I accidentally chipped off an SMD cap in that region, which made me wonder if this also contributed to the YLOD.

Here is a link containing a picture of which specific SMD cap I chipped off (the one with the blue marker):
https://ibb.co/bdF7zvB

Sorry for my bad English.
 
Just about 1601.
I had one JSD board with a YLOD and multiple 3034 and 4402 errors, plus a few random 1701 and 1601.
After reballing the CPU/RSX, the board would start with a GLOD and stay on without an image. After giving the errlogclear command, nothing was logged over many boots. Testing reset/recovery, I can hear the beeps. I saw the SB starting while debugging on the command line, so the RSX (with Samsung memory) is fried. 4th one this year.
I took measurements, and VDDC tended to vary from 1.6 ohms to 1.8 ohms.
Its values: http://s.go.ro/3dcxqqpx
I swapped in an RSX (with Hynix memory) from a KTE-001 and the board was fixed. That RSX measured 2.2 ohms off the board and 2.7 ohms on the board. Some RSX chips will work fine at 1.8 ohms, but 2 and up is best.
Some RSX chips can be saved, some not. In some cases on slims, errors won't be displayed when the RSX IC has internal issues.
I've had this on DYN-001, SUR-001, JSD and JTP boards. I didn't see this issue on KTE-001 and up.
At least 3 of them had 1701 and 1601 combined with GLOD and YLOD.
 
It's okay to lose that cap. It is part of a larger array, so losing 1 isn't a big deal.

Your English is very good. No worries!
 
Yeah, I am beginning to suspect issues with RSX RAM cause those errors at a higher rate than the other issues that can cause them. Especially when associated with a reflow, I think we can be reasonably sure that's what's wrong. This is one of the errors you don't want to see after a reball or reflow, as it means your RSX is hosed!
 
The 1601/1701 errors themselves mean very little. They are a check stop, kind of like a BSOD on a Windows PC: the cause can be just software, or any hardware error that interrupts the system while it's working (prefix 80). I've seen these errors after freezes caused by corrupt hard drive data.

The catch is when they come together with the 3034/44XX hehehe. And yeah, if a reballed RSX gives these... It's another candidate for a keyring.

If it's only a dodgy reflow, you are more or less where you started, which is good because you didn't completely kill it yet, so you are in time to stop.
A proper reball may fix the problem... Or maybe not.
 

Just an update: today I tried to get the system into safe mode but ended up with a YLOD again, just 3 beeps. I am guessing the next step will be NEC/Tokin replacement.
 
Did you thoroughly clean the BGA with electronics contact cleaner (before and after the reflow)? What flux did you use? What equipment did you use (heat gun? Bottom heater? What settings?). What temperatures specifically?! Did you dry the board and if so how long at what temps? Did you nudge the chip to confirm it went molten? These details are important. If you performed the usual "heat gun special," then there are MANY ways it could have failed. Youtube videos like that Ifixit one are terrible advice.

Since a reball is not an option for me (or most of us), is there a guide that shows the proper reflow procedure? The things you mentioned above like using contact cleaner before and after and nudging the chip are things I've never heard or seen before during a reflow (but makes sense). I really have nothing to lose at this point since the PS3 is just sitting there collecting dust. Thanks
 
Just follow a reball profile to reach the melting point and don't take the IC off the board; that is called a reflow. As long as you reuse the old alloy, it counts as a reflow. I have never fixed a PS3 with a reflow, only a PS4, and that was just as a test; it still got a proper reball afterwards.
 
... I am guessing next step will be Nec/Tokin replacement
No, your tokins are probably fine. You would only need to replace them if you get a 1002 syscon error.

Assuming your RSX reflow was successful, you've only ruled out the RSX's BGA. Your error history suggests the CPU's BGA could also be the issue (1200 = CPU overheat). The 1200s gave way to the 1601/1701s. Then the 3034/4xxx errors started. So it's possible the overheating CPU led to a BGA defect. We usually assume the RSX, because it occurs there more often, but the CPU isn't immune.

I would suggest a proper reball of both. A reflow can work too, but it's inferior. If that doesn't work, then an RSX replacement is in order. And if that still doesn't work, I think you're out of options.

Before you do that, however, you need to troubleshoot the board. Confirm voltages, the resistance of certain SMD components, clock generator frequencies, check for blown fuses, etc. If everything seems in order, then it's gotta be the RSX or CPU. I suspect RSX RAM or bump failure, or a CPU BGA.

Did you delid the CPU, and if so, did you inspect for any damaged traces? That's the only other thing I can think of that we haven't checked yet.
 