Fixing Broken Symlinks With Find and Replace
This is an advanced tutorial.
It is expected that you already have quite a bit of experience with Linux, the command line, scripting, etc.
I was recently reorganizing my files, and in doing so I renamed a directory, which broke a couple of hundred symlinks. I needed a way to find the broken symlinks and update them to use the new file path.
Symbolic links (herein known as symlinks) are a great organization tool that allows you to create shortcuts that merely link to other files/folders. This gives you a great degree of flexibility in a number of situations. You can use them for file organization, multiple backups with “snapshot” functionality so you can restore a previous version of a file, or even to help with web development.
For example, when I make changes to this website (like adding a new blog post), I first develop and make the changes on a local virtual machine. When I’m happy with it, I use git to push the updates to the web server where this site lives. After git pushes the files, I have a post-receive hook script that performs numerous actions, one of which is to copy the files to the web directory in a new directory named after the current date and time. From there, a symlink called “current” is pointed at that new directory, and the web server is configured to look to “current” for its web files.
The benefit of this is if something doesn’t work correctly on the production server, I can easily change the symlink “current” to point back to one of the previous versions of the website and the site is back up and running as it was before I made the change, and it only took a couple of seconds to do so!
I fully intend to write a blog post someday to dig into this method in detail but that is not this post. This post is about solving a problem with symlinks that I recently encountered.
The Problem
If a symlink is pointing at a file and the path changes (say a folder was renamed), the symlink is then “broken” and will no longer point at the file. This is not much of a problem if you only have one or two broken symlinks: you can update them and you’re done. But what if you renamed a main folder higher up in the path, which resulted in hundreds of broken symlinks? Yea, manually renaming all of them ain’t gonna happen. To the command prompt!
Example Setup
For the purposes of this post, I’m referring to “soft links” that use absolute paths. It’s possible to do this with relative paths as well; however, in my situation I only needed to fix absolute paths, so that’s all I’m going to cover here.
Finding existing symlinks
There are a few different ways to find existing symlinks.
Using the “ls” command
The regular “long listing” of the “ls” command will show symlinks that exist in the folder you specify.
$ pwd
/home/blah/Documents/Businesses/ACME/Finances/Client Invoices/2017
$ ls -l
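To show what to look for in that listing, here is a minimal sketch that builds a throwaway sandbox and lists one symlink. The directory names are made up for illustration and are not my real folder tree.

```shell
#!/usr/bin/env bash
# Sandbox sketch: every path below is hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/Clients/LuthorCorp/Invoices/2017" "$sandbox/Finances/2017"

# Create an absolute symlink: ln -s TARGET LINK_NAME
ln -s "$sandbox/Clients/LuthorCorp/Invoices/2017" "$sandbox/Finances/2017/LuthorCorp"

# In a long listing, a symlink's mode string starts with "l" and the
# entry is printed as "LINK_NAME -> TARGET".
listing="$(ls -l "$sandbox/Finances/2017")"
echo "$listing"

rm -rf "$sandbox"
```

The two things to spot are the leading “l” in the permissions column and the “->” arrow followed by the target path.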
Using the “find” command
The find command is able to look for specific file types (remember everything in Linux is treated as a “file”), such as files, directories, and links. You can use the find command to look in all subdirectories to find all the files of the type you’re looking for. In this case we’re going to search for symlinks, type “l”.
$ pwd
/home/blah/Documents/Businesses/
$ find . -type l
Unlike “ls -l”, the find command will show you the path to each link, but NOT where the link points to. Piping the output of the find command to “wc -l” gives us a count of the symlinks that were found; in this case, 544.
If we want to find all symlinks in a directory, and see where they point to, we can use the “exec” feature of the find command to run the ls -l command on every result.
$ pwd
/home/blah/Documents/Businesses/
$ find . -type l -exec ls -l {} \;
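As an alternative to running “ls -l” on every result, GNU find can print the link and its target itself. This is a sketch on a made-up sandbox, and it assumes GNU find, since “-printf” is not available in every find implementation.

```shell
#!/usr/bin/env bash
# Sandbox sketch: directory names are hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/Clients/A" "$sandbox/Clients/B" "$sandbox/Finances"
ln -s "$sandbox/Clients/A" "$sandbox/Finances/A"
ln -s "$sandbox/Clients/B" "$sandbox/Finances/B"

# Count every symlink under the sandbox.
link_count="$(find "$sandbox" -type l | wc -l)"
echo "symlinks found: $link_count"

# GNU find only: %p is the path that was found, %l is where the symlink points.
links_with_targets="$(find "$sandbox" -type l -printf '%p -> %l\n')"
echo "$links_with_targets"

rm -rf "$sandbox"
```

This avoids starting one “ls” process per link, which can matter once you have hundreds of them.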
Breaking the Symlinks
We’re going to break all of our symlinks by renaming a folder higher up in the path. Since our fictional file path is a directory of business information, let’s break all the links by renaming our “company”.
$ pwd
/home/blah/Documents/Businesses/
$ ls -l
drwxrwxr-x 4 blah blah 4096 Jan 24 14:01 ACME
$ mv ACME "Rossum Corporation"
$ ls -l
drwxrwxr-x 4 blah blah 4096 Jan 24 14:01 'Rossum Corporation'
The previous find command “find . -type l” will find symlinks whether they are working or broken. If we want to find only the broken symlinks, or list the broken symlinks along with where they point to, we can do the following (notice the change to “xtype”):
$ pwd
/home/blah/Documents/Businesses/
$ find . -xtype l
$ find . -xtype l -exec ls -l {} \;
Subsequently, we can see how many broken links are found by piping to the “wc -l” command:
$ pwd
/home/blah/Documents/Businesses/
$ find . -xtype l | wc -l
That’s way too many broken symlinks to have to try and fix manually.
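If you want to see the difference between “-type l” and “-xtype l” without touching real files, a small sandbox works well. This is a sketch with made-up folder names; it breaks one of two links by renaming a folder, just like the rename above.

```shell
#!/usr/bin/env bash
# Sandbox sketch: folder and link names are hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/ACME/Clients" "$sandbox/Other/Clients" "$sandbox/Links"
ln -s "$sandbox/ACME/Clients"  "$sandbox/Links/acme-clients"
ln -s "$sandbox/Other/Clients" "$sandbox/Links/other-clients"

# Renaming the folder breaks only the link that pointed inside it.
mv "$sandbox/ACME" "$sandbox/Rossum Corporation"

# -type l matches every symlink; -xtype l matches only the broken ones.
all_links="$(find "$sandbox" -type l | wc -l)"
broken_links="$(find "$sandbox" -xtype l | wc -l)"
echo "all symlinks: $all_links, broken symlinks: $broken_links"

rm -rf "$sandbox"
```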
Fixing Broken Symlinks
So let’s go ahead and fix those broken links. That’s right. All of them. At once.
Let’s have a look at just one of these broken links:
$ pwd
/home/blah/Documents/Businesses/
$ find ~/Documents/Businesses/ -xtype l | tail -n 1
In this case, we’re going to specify the “~/Documents/Businesses/” folder as our location to search so that our output includes the full absolute paths.
So the only thing we need to do is find the text “ACME” and replace it with “Rossum Corporation”. That will restore our symlinks to working order.
Breaking Down the Problem
Let’s break down what we need to do to solve this problem:
- Find a broken symlink
- Identify the “target” of the link and the “name” of the link
- Replace the text “ACME” in the target with the text “Rossum Corporation”
- Update the existing symlink to point to the new target
Well, we already know how to find a broken symlink, as well as how to have find execute a command on every result. That just leaves the last three: getting the parts of the link, replacing text, and updating the existing symlinks.
Identifying the target
The command “readlink” can be used to obtain information about a symlink. If readlink is used on just a path to a symlink, it will return the current target:
$ readlink ~/"Documents/Businesses/Rossum Corporation/Finances/Client Invoices/2013/LutherCorp"
/home/blah/Documents/Businesses/ACME/Clients/LuthorCorp/Invoices/2013
So that works, but how do we get that link name so we can use readlink? Well, we can use the find command for that. When we do something like this
$ find ~/Documents/Businesses/ -xtype l -exec ls -l {} \;
the “{}” represents the current result of the find command.
So if we did something like this
$ find ~/Documents/Businesses/ -xtype l -exec readlink {} \;
it should actually return the link target instead of the link name.
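Here is a quick sandbox sketch (made-up names again) confirming that readlink still reports the old target even though that target no longer exists, which is exactly what we need in order to rewrite it.

```shell
#!/usr/bin/env bash
# Sandbox sketch: folder and link names are hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/ACME/Clients" "$sandbox/Links"
ln -s "$sandbox/ACME/Clients" "$sandbox/Links/clients"
mv "$sandbox/ACME" "$sandbox/Rossum Corporation"

# The link is now broken, but the path stored inside it is unchanged,
# and readlink prints that stored path without checking that it exists.
stored_target="$(find "$sandbox" -xtype l -exec readlink {} \;)"
expected_target="$sandbox/ACME/Clients"
echo "$stored_target"

rm -rf "$sandbox"
```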
Replacing Text
This has been referenced in some of my other tutorials, but to replace text we can use the command “sed”. Sed is a “stream editor” whose main purpose is to process text and edit/manipulate it “in stream” before it’s output again.
The format of using sed to replace text is as follows:
$ sed "s/TEXT_TO_FIND/REPLACE_WITH/g"
- The “s” is to specify that a substitution (search and replace) should occur
- The “g” is to specify that the text should be replaced every time it is found. If “g” was not specified, the replacement would only occur on the first match on each line. We don’t really need to do a global replace here because we’re going to have only one match in every find result anyway, but in this case it also doesn’t hurt to leave it here.
So, we should be able to do something like this in order to replace our text:
$ sed "s/ACME/Rossum Corporation/g"
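To try the replacement on its own, you can feed sed a single path. The sketch below uses the example target from earlier; the second form is my own variation that uses “|” as the delimiter so the slashes in the pattern don’t need escaping, and anchors the match to the folder name so an “ACME” elsewhere in the path would be left alone.

```shell
#!/usr/bin/env bash
old_target="/home/blah/Documents/Businesses/ACME/Clients/LuthorCorp/Invoices/2013"

# Same expression as above, applied to one path.
new_target="$(printf '%s\n' "$old_target" | sed "s/ACME/Rossum Corporation/g")"
echo "$new_target"
# → /home/blah/Documents/Businesses/Rossum Corporation/Clients/LuthorCorp/Invoices/2013

# Variation: sed accepts other delimiters, so a pattern containing
# slashes can be written without backslash-escaping each one.
scoped_target="$(printf '%s\n' "$old_target" | sed "s|/Businesses/ACME/|/Businesses/Rossum Corporation/|")"
echo "$scoped_target"
```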
Updating the Symlink
The last thing that we need to do is update the symlink. The format for creating a symlink is (and yes I have to look it up every bloody time too to see if TARGET goes before or after LINK_NAME):
$ ln -s TARGET LINK_NAME
If you try to run that command on a symlink that already exists though, instead of the symlink being updated it will produce an error. There are two choices: you can either delete the symlink and recreate it, or you can “force” the update to occur.
Using the “-f” switch with “ln” will force the existing symlink to be overwritten, thereby updating it.
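Because some of our symlinks point to files and others point to folders, we also need to use the “-T” switch. It tells “ln” to always treat LINK_NAME as the link itself. Without it, if the existing link resolves to a directory, “ln” creates the new link inside that directory instead of replacing the link.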
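The effect of “-T” is easiest to see on a link that still resolves to a directory. The sketch below uses a sandbox with made-up names and assumes GNU ln, since “-T” is a GNU coreutils option.

```shell
#!/usr/bin/env bash
# Sandbox sketch: directory and link names are hypothetical.
sandbox="$(mktemp -d)"
old_dir="$sandbox/old-dir"
new_dir="$sandbox/new-dir"
mkdir "$old_dir" "$new_dir"
ln -s "$old_dir" "$sandbox/link"

# Without -T: "link" resolves to a directory, so ln puts a new symlink
# INSIDE that directory and leaves "link" itself untouched.
ln -sf "$new_dir" "$sandbox/link"
without_T="$(readlink "$sandbox/link")"
stray_created=no
[ -L "$old_dir/new-dir" ] && stray_created=yes

# With -T: "link" is treated as the link name itself and is replaced.
ln -sfT "$new_dir" "$sandbox/link"
with_T="$(readlink "$sandbox/link")"

echo "without -T: $without_T (stray link created: $stray_created)"
echo "with -T:    $with_T"

rm -rf "$sandbox"
```

A broken link doesn’t resolve to a directory, so the stray-link problem wouldn’t bite in that case, but “-T” keeps the behaviour the same either way.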
Putting it all Together
Now that we have all our pieces, let’s put it all together to fix all 544 broken symlinks using just one command. We’re going to add an echo in here first to see what changes will be made before we commit.
$ find ~/Documents/Businesses/ -xtype l -exec bash -c 'target="$(readlink "{}")"; link="{}"; target="$(echo "$target" | sed "s/ACME/Rossum Corporation/g")"; echo "ln -Tfs "$target" "$link""' \;
Let’s break down that command to see how it works:
- find ~/Documents/Businesses/ -xtype l
- Run the “find” command on the “/home/blah/Documents/Businesses/” directory to look for broken symlinks.
- -exec bash -c '…' \;
- Execute the following command on every result found. Because we want to perform multiple functions, we call “bash -c” to run a string of bash commands. Commands are separated by semi-colons.
- target="$(readlink "{}")";
- Use the “readlink” command on the find result {} and store the result in the variable called “target”.
- link="{}";
- Store the result (full path) {} in the variable called “link”
- target="$(echo "$target" | sed "s/ACME/Rossum Corporation/g")";
- Take the existing value of the “target” variable and pipe it into “sed” to find and replace the text “ACME” with “Rossum Corporation”. Update the “target” variable with the new text.
- echo "ln -Tfs "$target" "$link""' \;
- Use “echo” to test our final command before running live.
- The echoed command uses “ln” to update the existing symlinks in place, using the variables “target” and “link” that we built earlier in the command string.
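One variation worth knowing about: when “{}” is pasted directly into the “bash -c” string, a file name containing characters like a double quote, “$”, or a backtick is read as shell syntax, and not every find implementation substitutes “{}” inside a larger argument. A common workaround is to pass the path as a positional argument instead. The sketch below is my own hedged variant of the dry run, not the command used above, and it runs against a sandbox with made-up names.

```shell
#!/usr/bin/env bash
# Sandbox sketch: folder and link names are hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/ACME/Clients" "$sandbox/Links"
ln -s "$sandbox/ACME/Clients" "$sandbox/Links/client files"
mv "$sandbox/ACME" "$sandbox/Rossum Corporation"
original_target="$sandbox/ACME/Clients"

# The found path arrives as $1 (the "_" fills $0), so it is never
# parsed as part of the script text.
dry_run="$(find "$sandbox" -xtype l -exec bash -c '
  link="$1"
  target="$(readlink "$link")"
  target="$(printf "%s\n" "$target" | sed "s/ACME/Rossum Corporation/g")"
  # %q prints each value shell-quoted, so the preview is copy-paste safe.
  printf "ln -Tfs %q %q\n" "$target" "$link"
' _ {} \;)"
echo "$dry_run"

# Dry run only: the link itself has not been changed yet.
link_after_dry_run="$(readlink "$sandbox/Links/client files")"

rm -rf "$sandbox"
```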
Now that we can see our final “ln” command uses the new, correct path, we can remove the “echo” and let it run for real.
WARNING
The following command(s) can be potentially destructive to your data. Please ensure you have
double and triple-checked the command you are about to run AS IT APPLIES TO YOUR OWN
SYSTEM. Preferably you will have also backed up your data before proceeding.
Techbit assumes no responsibility for lost or damaged data.
You have been warned.
$ find ~/Documents/Businesses/ -xtype l -exec bash -c 'target="$(readlink "{}")"; link="{}"; target="$(echo "$target" | sed "s/ACME/Rossum Corporation/g")"; ln -Tfs "$target" "$link"' \;
So now if we search for just broken links (xtype l), no results are found. Doing a listing shows the links are pointing to the correct location, and counting the symlinks with “wc -l” again shows that all 544 links are still there.
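If you would rather rehearse the whole thing before pointing it at real data, the sketch below runs the same fix against a throwaway sandbox and then checks the result. The folder and file names are made up, and it assumes GNU find and GNU ln.

```shell
#!/usr/bin/env bash
# Sandbox rehearsal: every path below is hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/ACME/Clients/LuthorCorp" "$sandbox/ACME/Finances"
echo "invoice" > "$sandbox/ACME/Clients/invoice.txt"
ln -s "$sandbox/ACME/Clients/LuthorCorp"  "$sandbox/ACME/Finances/LuthorCorp"   # link to a folder
ln -s "$sandbox/ACME/Clients/invoice.txt" "$sandbox/ACME/Finances/invoice.txt"  # link to a file

# Break both links by renaming the top folder.
mv "$sandbox/ACME" "$sandbox/Rossum Corporation"
broken_before="$(find "$sandbox" -xtype l | wc -l)"

# Same fix as above, pointed at the sandbox instead of ~/Documents/Businesses/.
find "$sandbox" -xtype l -exec bash -c 'target="$(readlink "{}")"; link="{}"; target="$(echo "$target" | sed "s/ACME/Rossum Corporation/g")"; ln -Tfs "$target" "$link"' \;

# Verify: no broken links remain and no links were lost.
broken_after="$(find "$sandbox" -xtype l | wc -l)"
total_links="$(find "$sandbox" -type l | wc -l)"
fixed_target="$(readlink "$sandbox/Rossum Corporation/Finances/invoice.txt")"
expected_target="$sandbox/Rossum Corporation/Clients/invoice.txt"
echo "broken before: $broken_before, broken after: $broken_after, total links: $total_links"

rm -rf "$sandbox"
```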
So there we go, the power of the command prompt again reduces a time consuming manual task down to a few seconds.
If you have any questions/comments please leave them below.
Thanks so much for reading ^‿^
Claire