In the 1960s and 1970s, Dr. Edgar Codd, a British computer scientist, developed a means of reducing data duplication within relational databases. While six “forms” of the process exist, forms 1 through 4 ought to be at every database application developer’s fingertips. Fifty, nearly sixty, years have elapsed. These guidelines still apply today, and when we forget them, we cripple the flexibility and the ability of software to grow and adapt.
Number Zero of the data normalization forms is called: Unnormalized. While I would love to joke about this structure of data, we all encounter it all too often. Therefore, let’s acknowledge that it exists.
Movin’ on!
To normalize data, one must be working within a relational database environment, such as Oracle. A table may have zero, one, or more rows (records). Each row must have a primary key, and the primary key must be unique. Quick checklist:
Do not add a field called Contact2, ever.
Do not add a field called email2, ever.
Do not add a field called Address2, ever.
If you have multiple contacts, add a table for the contacts. One row for each contact.
If you find that your tables have columns duplicating sideways (laterally), then create a new table! Hey, people have multiple email addresses. Create a table with a primary key and one email address per row. Validate the quality of the email to the best of the application’s ability. Each new email address is a new row. Done. (A sketch follows the checklist below.)
This is a casual, but functional means of defining the First Normal Form.
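If it helps to see that checklist item in DDL, here is a minimal sketch. The table and column names are mine, it assumes a parent CONTACTS table keyed by CONTACT_ID already exists, and it uses an identity column (Oracle 12c or later):

```sql
-- Minimal 1NF sketch (hypothetical names): one email address per row,
-- never an EMAIL2 column. Assumes a CONTACTS table keyed by CONTACT_ID.
CREATE TABLE contact_emails (
  email_id   NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  contact_id NUMBER NOT NULL REFERENCES contacts (contact_id),
  email      VARCHAR2(320) NOT NULL,
  CONSTRAINT contact_emails_uk UNIQUE (contact_id, email)
);

-- A second email address for the same contact is simply another row.
INSERT INTO contact_emails (contact_id, email) VALUES (1, 'fred@flintstone.fake');
INSERT INTO contact_emails (contact_id, email) VALUES (1, 'barney@flintstone.fake');
```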
Look at your tables. Do you have important data that users duplicate vertically, row after row? You can see this if you have multiple clients called Acme Anvil. If you have five contact people named Wiley E. Coyote, then you have an opportunity to improve the data model.
The solution tends to be new tables and breaking the data into smaller building blocks. When done well, you work towards both the Second and Third Normal Forms.
When we ask users to type free-form data into fields, we tend to get junk. They enter “data” that is not really data. I am not talking about client names or personal names, but all of those reason fields and note fields: that free-form information. We mistype; we abbreviate; we get verbose; we get terse. In short, we get junk. Unless we use artificial intelligence or wickedly complicated code, we lose the value of what the user is trying to tell us! Junk is junk!
If data matter to us, then we ought to use lookup tables and codes so that we capture what users need to track. It takes listening and tracking what they are doing. That is a great reason to talk with users and talk with your client. It is their data. We like good data. Good data has value. Bad data is bad.
To satisfy the First Normal Form of relational databases:
Let’s imagine a table with client data. The classic table has the unique primary key and the client name. Imagine needing an address. Let’s add an address column to the table. Into the column, we type the address: Spui 70 (boîte postale 12600), 2500 DJ La Haye, Nederland (that’s The Hague). Suddenly, I need another address for the same location, so I add in a delivery address for heavy trucks. If I put a second address into the ADDRESS field, then I have eliminated the usefulness of the address field, and I have not followed the First Normal Form.
Another example with the client table: imagine a field for the email address of your contact person there. Store one address: compliant. Store two in that field and you are not compliant. Game over.
fred@flintstone.fake is fair.
fred@flintstone.fake, barney@flintstone.fake in the same field breaks the rules.
Imagine that you need to email the client and you have two email addresses in the same field called EMAIL. If you send the email with one vendor, they want commas to separate the addresses. If you use the list with a different email vendor, they want a semi-colon between the two email addresses. You’ve lost control of the formatting, validity, and usefulness of the email addresses. It is useless data (unless you write code that parses it out).
Same with printing a mailing label for the Hague address above. You want precisely and exactly one address on a shipping label. If your ADDRESS field has two, then you are sunk. You lost the data game. No points for you.
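One way out of that trap, again sketched with hypothetical names and assuming a CLIENTS table keyed by CLIENT_ID: move addresses into their own child table, one row per address, with a type code so the mailing address and the heavy-truck delivery address can coexist. The delivery address text below is invented purely for illustration.

```sql
-- Hypothetical child table: one row per address, never two addresses in one field.
CREATE TABLE client_addresses (
  address_id   NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  client_id    NUMBER NOT NULL REFERENCES clients (client_id),
  address_type VARCHAR2(20) NOT NULL
               CHECK (address_type IN ('MAILING', 'DELIVERY')),
  address_text VARCHAR2(500) NOT NULL
);

-- The Hague office becomes two rows, not two values crammed into ADDRESS.
INSERT INTO client_addresses (client_id, address_type, address_text)
VALUES (11235, 'MAILING', 'Spui 70 (boîte postale 12600), 2500 DJ La Haye, Nederland');

INSERT INTO client_addresses (client_id, address_type, address_text)
VALUES (11235, 'DELIVERY', 'Heavy-truck entrance, Spui 70, 2500 DJ La Haye, Nederland');
```

Now the shipping label query asks for exactly the DELIVERY row and gets exactly one address back.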
To satisfy the Second Normal Form (2NF):
Sure, yes, part of this is covered by the requisite steps listed above. Sometimes, we create data tables that require applications to always use multiple keys to find a desired row. In this situation, we use multiple or compound keys to access data, and the Second Normal Form adds that every non-key column must depend on the whole compound key, not just part of it (see the sketch below).
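A quick sketch of what that compound key can look like, with hypothetical names and assuming a CLIENTS table keyed by CLIENT_ID: a division only makes sense in the context of its client, so both columns together form the primary key.

```sql
-- Hypothetical: a division is identified by its client plus a division code,
-- so the two columns together form the primary key.
CREATE TABLE client_divisions (
  client_id     NUMBER        NOT NULL REFERENCES clients (client_id),
  division_code VARCHAR2(10)  NOT NULL,
  division_name VARCHAR2(200) NOT NULL,
  CONSTRAINT client_divisions_pk PRIMARY KEY (client_id, division_code)
);

-- 2NF check: the non-key column (division_name) depends on the whole key.
-- A column such as client_name would depend on client_id alone and belongs in CLIENTS.
```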
Violating this rule happens all of the time. Can you envision the need to store a client in the client table that may have two divisions at two locations? Some people would create two separate clients, each with one address. But are they two clients or one? We now have the name duplicated. In these rows, we have ACME ANVILS:
Primary Key | Client Name | Address
----------- | ----------- | -----------------------
11235       | Acme Anvil  | 123 Roadrunner Ave, WB
2358        | Acme Anvil  | 123 Wiley Ln, WB
We do not know if this is one client with two addresses or two clients. We failed to model the complexity of the real-world data. We must solve problems with multiple divisions and multiple addresses. The solution tends to involve adding tables that have greater specificity.
We are creating redundancy in the data going vertically. Which Acme Anvil do we need? How do we identify the one we want?
You want to place your finger on the required data elements with ease. If you don’t know which is which, then how will the application?
Typically, we need to add fields and tables to more completely and more robustly model the data as it is, not as we want it to be. Yes, the real world is messy. Clients have divisions, multiple offices, multiple addresses, multiple contact people. Roll your sleeves up, figure out the data needed and how to relate it.
Do not be afraid to view the data world as small data blocks that snap together like Lego bricks.
In this case, maybe we have a table for divisions. We create a table for addresses. Maybe we get super clever and add a client foreign key that references the primary key in its own table. This self-reference creates a tree structure and enables a hierarchical query. Super powerful for modelling data that nests within itself. With this technique, a client becomes a child of another client, which can represent a division within a client (see the sketch below).
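Here is one sketch of that self-reference, as a standalone example with hypothetical names, plus the Oracle hierarchical query that walks the resulting tree.

```sql
-- Hypothetical: a client may point to a parent client, so a division is simply
-- a child row of the client it belongs to.
CREATE TABLE clients (
  client_id        NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  client_name      VARCHAR2(200) NOT NULL,
  parent_client_id NUMBER REFERENCES clients (client_id)  -- NULL for a top-level client
);

-- Walk the tree: indent each division beneath its parent client.
SELECT LPAD(' ', 2 * (LEVEL - 1)) || client_name AS client_tree
  FROM clients
 START WITH parent_client_id IS NULL
CONNECT BY PRIOR client_id = parent_client_id
 ORDER SIBLINGS BY client_name;
```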
Data typed by hand is: junk.
To satisfy the Third Normal Form (3NF):
Often, when we are asking users to classify or identify the qualities of data, we are asking them to type something. We had a situation today when a user knew that a client would have a tax ID, but did not yet have one. So the user typed: 0000.0000.0000 – which meant, “I put in a placeholder”. It is not a valid tax ID. The team failed to model this event in the lifecycle of client data. We use databases to organize and query data, often massive amounts of data. Consistency matters. Maybe there ought to be a field that says: Tax ID pending. Or maybe there is a lifecycle for the tax ID.
Now we have four states of the tax ID in addition to the tax ID number. We get more information.
To satisfy the Third Normal Form, we should create a table for Tax ID status:
1) NONE
2) PENDING
3) EXISTS
4) VALID
We then use a lookup table in conjunction with the Tax ID field to provide a more complete understanding of the data we are capturing. If a Tax ID is empty, we would have to guess. Is the Tax ID empty because the client does not need one? Or is it empty because the Tax ID is pending? If guessing, you’ll be right some of the time! 50-50, right?
When data become dependent on each other, we add tables and lookups to enforce consistency within the data set.
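A minimal sketch of that idea, again with hypothetical names and assuming a CLIENTS table as above: the four statuses live in their own lookup table, and every client row must reference one of them.

```sql
-- Hypothetical lookup table holding the four Tax ID states.
CREATE TABLE tax_id_statuses (
  tax_id_status VARCHAR2(10) PRIMARY KEY
);

INSERT INTO tax_id_statuses VALUES ('NONE');
INSERT INTO tax_id_statuses VALUES ('PENDING');
INSERT INTO tax_id_statuses VALUES ('EXISTS');
INSERT INTO tax_id_statuses VALUES ('VALID');

-- Each client carries exactly one status; free-form "reason" text is impossible.
ALTER TABLE clients ADD (
  tax_id        VARCHAR2(30),
  tax_id_status VARCHAR2(10) DEFAULT 'NONE' NOT NULL
                REFERENCES tax_id_statuses (tax_id_status)
);
```

Now an empty TAX_ID plus a status of PENDING tells the whole story; no guessing required.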
Had we asked users to type a reason for the absence of the Tax ID, we would see all sorts of junk! We would turn an opportunity to improve the quality of the data into a mess. If your users select from lists, then we are working harder to satisfy the Third Normal Form. My shorthand version of this sounds like this: use lookup tables to validate data.
Data get complicated because, friends, our world is complicated. We can handle tables with multiple data dependencies. Normal Forms four, five, and six exist. And while I barely understand the terminology used in the academic literature and even online, following these forms leads us into gloriously interesting structures: child tables that matrix to multiple parent tables. Once playing in these deeper waters, peers tend to sit and discuss how to work through these issues.
When making decisions, we always make trade-offs. We violate the normal forms both accidentally and on purpose. In my podcast, The Soul of an Internet Machine, I discuss data normalization. In the late 1960s, people were not writing software that ran globally, supporting 100 spoken languages across two dozen time zones circling the earth. We did not create a means of supporting data in multiple languages simultaneously. If we duplicate the data laterally for English, French, Dutch, German, and Italian, we violate normal forms. If we create new rows, we may also violate normal forms (it depends on the structure). If we store the phrases within JSON data inside of a field, we again violate the rules. So we plan and discuss.
When we add fields such as contact1, contact2, contact3, contact4, or even email1, email2, email3, email4, then we’ve created structural flaws in our applications from which we may never recover. Our application has a limited useful life – until we bring it closer to the Normal Forms. But strict adherence may yield other problems. I get as close as I can to the Normal Forms, then I acknowledge… maybe not.