Pro Tip: How to avoid database deadlocks when working with CRON jobs

The problem

Database overloading in a large commercial system that has a complicated database and too many CRON jobs.

Tech stack in focus:

  • PHP
  • MySQL
  • Apache2
  • CRON jobs

Mistakes you make when designing a system that cost you a lot in the future

Some seemingly minor mistakes you make when designing a large-scale application can cost you a lot later, even though they are easy to avoid. The ones that lead to database overloading include:
  1. Designing a database without Foreign keys
  2. Choosing wrong data types and their sizes
  3. Using too many TEXT fields
  4. Database with duplicated data across multiple tables
  5. Not using indexes
  6. Too many joins
  7. Updating a single column over and over again to keep track of system updates
  8. Too many queries to file system
  9. Not setting up a mechanism for getting rid of old unused data

Avoiding the mistakes and database deadlocks

When you have scheduled tasks and are designing a system that periodically updates stats (Apache serving PHP on top of MySQL), prefer INSERT queries over UPDATE queries for the rows or columns you refresh periodically. Repeated UPDATEs compete for the same row locks as your users' transactions; INSERTs into a log-style table do not.
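As a minimal sketch of this pattern (the stats_log table and all column names here are hypothetical, not from any particular schema), each CRON run appends a row instead of contending for the same row lock:

```sql
-- Hypothetical append-only log: each CRON run inserts a new row
-- instead of updating a single shared row.
CREATE TABLE stats_log (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    metric_name VARCHAR(64) NOT NULL,
    metric_value INT NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    KEY idx_metric_created (metric_name, created_at)
);

-- The CRON job only appends; no existing row locks are contended.
INSERT INTO stats_log (metric_name, metric_value) VALUES ('active_users', 1520);

-- Readers fetch the latest value per metric.
SELECT metric_value
FROM stats_log
WHERE metric_name = 'active_users'
ORDER BY created_at DESC
LIMIT 1;
```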

1. Designing a database without Foreign keys

The best advice I can give you is: do not rush the step of designing your database. Whether you are on a short deadline or just being careless about this step, it is going to cost you major $$$ and time in the future.

Take your time, think things through, use dependencies and make your data consistent.

Use foreign keys everywhere! If you skip them because of the pain of thinking them through and putting them in, or because of the deletion errors you face when removing data, think again.

If you set up your foreign keys properly, deletion becomes easy and there will be almost no orphaned data left behind when you remove a top-level record.

They also protect you from removing a record in a parent table that still has child records.
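A minimal sketch of what this looks like, using hypothetical customers and orders tables:

```sql
-- Hypothetical parent/child tables: the foreign key keeps the data
-- consistent and cascades deletes so no orphaned rows are left behind.
CREATE TABLE customers (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE orders (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id INT UNSIGNED NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id)
        REFERENCES customers (id) ON DELETE CASCADE
) ENGINE=InnoDB;

-- Deleting a customer now removes their orders automatically.
-- With the default (ON DELETE RESTRICT) the delete would fail instead,
-- protecting a parent record that still has children.
DELETE FROM customers WHERE id = 42;
```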

2. Choosing wrong data types and their sizes

The rule here is: don't overdo them.

The maximum allowed values are the ones you decide on at the application level, not the largest values a user *could* put in. Remember, you can always validate on the application side.

Not every record will hold the maximum value, so don't over-size your columns.

Using UNSIGNED for auto-increment columns and for values that can never be negative is another good example.
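A small hypothetical example of right-sized column types:

```sql
-- Hypothetical table illustrating right-sized column types.
CREATE TABLE products (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- UNSIGNED: ids are never negative
    status TINYINT UNSIGNED NOT NULL,         -- a handful of states fits in 1 byte
    quantity SMALLINT UNSIGNED NOT NULL,      -- 0..65535 is plenty here
    price DECIMAL(10,2) NOT NULL,             -- exact money values, not FLOAT
    sku VARCHAR(32) NOT NULL,                 -- sized to the application's own limit
    PRIMARY KEY (id)
);
```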

3. Using too many TEXT fields

TEXT columns are slower; VARCHAR is better. Use TEXT only when the content is going to be really long.

TEXT columns can only be part of an index up to a specific prefix length; VARCHAR is the better option here too.

Too many TEXT columns in a record will make fetching your data slower.
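A short sketch of the difference, with hypothetical articles columns:

```sql
-- VARCHAR can be indexed fully; TEXT can only carry a prefix index.
CREATE TABLE articles (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(200) NOT NULL,   -- short, frequently searched: VARCHAR
    body MEDIUMTEXT NOT NULL,      -- genuinely long content: TEXT is fine
    PRIMARY KEY (id),
    KEY idx_title (title)          -- full index on the VARCHAR column
);

-- A TEXT column must be indexed with an explicit prefix length:
CREATE INDEX idx_body_prefix ON articles (body(100));
```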

4. Database with duplicated data across multiple tables

Saving the same data in multiple tables for the sake of convenience (or whatever) is a stupid decision to make.

Do not make this mistake; keeping the copies in sync becomes a nightmare.
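A brief illustration, assuming the hypothetical customers/orders schema from above: store customer details once and reference them by id, instead of copying name or email into every order row.

```sql
-- Bad: orders(customer_name, customer_email, ...) duplicated per order.
-- Good: orders stores only customer_id and joins when the details are needed.
SELECT o.id, o.total, c.name
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.id = 1001;
```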

5. Not using indexes

For faster queries, use indexes: with an index, the table is not scanned from the start until the record you need is found. The difference shows up once you have millions of records and every second of delay is costing you $$$.
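As a quick sketch (assuming a hypothetical orders table with a status column), EXPLAIN shows whether MySQL can use an index or has to scan the whole table:

```sql
-- Without an index on status, this query examines every row:
EXPLAIN SELECT * FROM orders WHERE status = 'pending';

-- Add an index so MySQL can jump straight to the matching rows:
CREATE INDEX idx_orders_status ON orders (status);

-- Re-running EXPLAIN should now show the index in the "key" column
-- and far fewer examined rows.
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
```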

6. Too many joins

One of the biggest mistakes I have seen is this one: joins on top of queries, queries with subqueries and more joins. Do not over-complicate things.

Remember, there are always multiple ways to do a thing; you just need to sit down and think.
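One possible way to untangle such queries, sketched with hypothetical tables and filters: fetch the small driving set first, then run a plain second query instead of nesting everything into one statement.

```sql
-- Step 1: find only the ids you actually need.
SELECT id FROM customers WHERE country = 'DE';

-- Step 2: query the details with a simple, index-friendly condition,
-- using the ids returned by step 1.
SELECT * FROM orders WHERE customer_id IN (101, 102, 103);
```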

7. Updating a single column over and over again to keep track of system updates

Another mistake when planning a CRON job is updating a single column value to keep a record of runs and then building your business logic on top of it.

For example, you base your logic on when the last CRON iteration ran, and every time it runs you update a column value, i.e. a date field. It sounds genius at first, but this is one major cause of deadlocks.

Customers are working in your system, updating, fetching and deleting records all the time. As your system grows, the chances of overlapping updates increase and will most definitely overload your system, which in time becomes almost impossible to track and fix.
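A sketch of the anti-pattern and the append-only alternative from the intro (table and column names are hypothetical); SHOW ENGINE INNODB STATUS is the standard way to inspect the most recent InnoDB deadlock:

```sql
-- The problematic pattern: every CRON iteration fights customer
-- transactions for the lock on this one row.
UPDATE system_status SET last_cron_run = NOW() WHERE id = 1;

-- Safer: append one row per run and read the latest.
CREATE TABLE cron_runs (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    job_name VARCHAR(64) NOT NULL,
    ran_at DATETIME NOT NULL,
    PRIMARY KEY (id),
    KEY idx_job_ran (job_name, ran_at)
);

INSERT INTO cron_runs (job_name, ran_at) VALUES ('stats_refresh', NOW());

SELECT MAX(ran_at) FROM cron_runs WHERE job_name = 'stats_refresh';

-- When deadlocks do occur, the most recent one is reported by:
SHOW ENGINE INNODB STATUS;
```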

8. Too many queries to file system

It is just too slow to deal with files on a frequent basis. If you are updating records and need to read or write information frequently, put it in a database table instead of a file.
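A minimal sketch of such a table (the app_settings schema is a hypothetical key/value layout):

```sql
-- Instead of rewriting a file on every change, keep frequently
-- updated values in a small key/value table.
CREATE TABLE app_settings (
    setting_key VARCHAR(64) NOT NULL,
    setting_value VARCHAR(255) NOT NULL,
    updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
        ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (setting_key)
);

-- Insert the value, or overwrite it if the key already exists.
INSERT INTO app_settings (setting_key, setting_value)
VALUES ('import_cursor', '18342')
ON DUPLICATE KEY UPDATE setting_value = VALUES(setting_value);
```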

9. Not setting up a mechanism for getting rid of old unused data

One big issue is the data you no longer use; over time it increases the cost of running even simple queries. Keeping old data in backups is better than keeping it in your active database tables.

Set up a mechanism to move the data you don't need out of your active system.
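One possible cleanup mechanism, sketched with a hypothetical orders_archive table: copy old rows out, then delete them in small batches so the cleanup itself doesn't hold long-running locks.

```sql
-- Move rows older than a year into the archive table.
INSERT INTO orders_archive
SELECT * FROM orders
WHERE created_at < NOW() - INTERVAL 1 YEAR;

-- Delete in batches; re-run until no rows are affected,
-- so each statement holds its locks only briefly.
DELETE FROM orders
WHERE created_at < NOW() - INTERVAL 1 YEAR
LIMIT 1000;
```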

I hope you like these suggestions I have learned over the years.

Let me know your thoughts and feedback so I can improve this article.
