When I was in college, the only type of database we studied was the relational database, queried with SQL.
However, I now have an application where, if you tried to use a relational database, more than 90% of the cell values would be null and the tables would have thousands of columns.
Clearly, a relational database is the wrong choice.
My question is, what type of database is better suited to my application?
If you already know what SQL is, feel free to skip this section.
In a relational database, everything is stored in spreadsheet-like tables.
For example, a table recording data for apartment rentals might have the following columns:
- Floor Area Lower Bound (e.g. 450 square feet)
- Floor Area Upper Bound (e.g. 850 square feet)
- Monthly Rent (e.g. $750.00)
- Is the electric bill paid by the landlord? (Boolean)
- Is the water bill paid by the landlord? (Boolean)
- Street Address of the apartment (e.g. 123 somewhere lane)
- etc…
I was thinking about creating a database for job applicants and job classifieds.
Traditionally, job classifieds are stored as Unicode strings.
Computers have difficulty parsing and interpreting English.
Humans end up reading the job classifieds and sorting them “by hand.”
Suppose that a prospective job applicant has no security clearance.
It would be nice if the computer could delete all rows of the search results containing jobs for which a security clearance is required. This would save people the time of reading classifieds for jobs they are not qualified for.
The question is whether job J has at least one mandatory/minimum job qualification Q such that a human being, Sarah, does not have job qualification Q.
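As a rough sketch of that test, assuming each qualification is a plain string and that both the job's requirements and the applicant's qualifications are held in sets (the names `must_exclude`, `required`, and `held` are hypothetical, and "unknown" answers are ignored here):

```python
def must_exclude(required: set[str], held: set[str]) -> bool:
    """Return True if the job requires at least one qualification
    that the applicant does not hold."""
    # Set difference leaves the required qualifications that are missing.
    return bool(required - held)


# Hypothetical sample data for job J and for Sarah.
job_j = {"security clearance", "bachelor's degree"}
sarah = {"bachelor's degree"}

print(must_exclude(job_j, sarah))  # True: Sarah lacks the security clearance
```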
We are often working with 3-valued logic. In a generalization of the mathematical “law of the excluded middle,” one of the following 3 is always the case:
- Jane has a commercial driver’s license.
- Jane does NOT have a commercial driver’s license.
- It is unknown whether Jane has a commercial driver’s license or not.
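One possible way to model these three cases in code is to let `None` stand for "unknown" next to `True` and `False`. The sketch below is only an illustration: the names `Answer`, `definitely_lacks`, and `kleene_and` are hypothetical, and the AND rule shown is Kleene's strong three-valued conjunction, which is one common choice rather than the only one.

```python
from typing import Optional

# None stands for "unknown"; True and False are definite answers.
Answer = Optional[bool]


def definitely_lacks(answer: Answer) -> bool:
    """Only a definite 'no' counts as lacking the qualification."""
    return answer is False


def kleene_and(a: Answer, b: Answer) -> Answer:
    """Three-valued AND: a definite False wins, otherwise unknown propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


jane_cdl: Answer = None  # it is unknown whether Jane has a CDL

print(definitely_lacks(jane_cdl))  # False
print(kleene_and(True, None))      # None
print(kleene_and(False, None))     # False
```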
We want a database where:
- Roles/Positions have qualifications.
- Actors/Job-applicants have qualifications.
If a job requires at least 2 years of Java programming, and Ian has 8 years of Java programming, then we choose NOT to delete that job from the search results we show to Ian.
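The numeric case might be sketched as a comparison against a minimum. The function name `fails_minimum` is hypothetical, and treating unknown experience (`None`) as "do not delete the job" is an assumption of this sketch, not something settled above.

```python
from typing import Optional


def fails_minimum(required_years: float, applicant_years: Optional[float]) -> bool:
    """True only when the applicant is known to fall short of the minimum.

    Assumption: unknown experience (None) does not remove the job.
    """
    return applicant_years is not None and applicant_years < required_years


print(fails_minimum(2.0, 8.0))   # False: 8 years meets a 2-year minimum, keep the job
print(fails_minimum(2.0, 0.5))   # True: known to fall short, delete the job
print(fails_minimum(2.0, None))  # False: unknown, keep the job for now
```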
I would prefer that bots filter and prune the search-space as much as possible.
A bot could run a search query, such as “furniture mover,” using a traditional, nothing-fancy search engine. After that, a bot could identify which job qualification would split the search results most nearly in half. A 40%-60% split is better than a 1%-99% split.
Maybe a commercial driver's license is a suitable job qualification. After identifying an attribute to prune on, the computer can ask the human something like, “Do you have a commercial driver’s license?” The answer might cut the search space in half.
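Choosing the question to ask could be sketched as follows, assuming each search result is represented by the set of qualifications it requires. The function name `best_split_qualification` and the sample jobs are hypothetical; ties are broken alphabetically only to keep the result deterministic.

```python
from typing import Optional


def best_split_qualification(jobs: list[set[str]]) -> Optional[str]:
    """Pick the qualification whose share of jobs requiring it is closest to 50%."""
    total = len(jobs)
    counts: dict[str, int] = {}
    for required in jobs:
        for qualification in required:
            counts[qualification] = counts.get(qualification, 0) + 1
    if not counts:
        return None
    # Smaller distance from an even split is better; the name breaks ties.
    return min(counts, key=lambda q: (abs(counts[q] / total - 0.5), q))


# Hypothetical search results for "furniture mover".
results = [
    {"commercial driver's license", "can lift 50 lb"},
    {"commercial driver's license", "can lift 50 lb"},
    {"can lift 50 lb"},
    {"can lift 50 lb", "forklift certification"},
]

print(best_split_qualification(results))  # commercial driver's license
```

Here "can lift 50 lb" is required by every job, so asking about it could not give an even split, while the commercial driver's license is required by exactly half of the results.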
Every time the end user is asked a question about their qualifications, the computer stores the answer in a database.
A SQL table for job-applicants would have more than 2,000 columns (i.e. job qualifications). Examples of column headers are shown below:
- Number of years of Network domain experience (float)
- Number of years of experience operating fork-lifts (float)
- Are you a licensed plumber? (Boolean)
- Do you have a Ph.D. in psychology? (Boolean)
- Are you licensed in the United States as a professional counselor (LPC)? (Boolean)
- Number of years of experience writing computer code for front ends (float)
- Do you have a CDL (commercial driver’s license)? (Boolean)
I am not willing to record, for every human being under the sun, whether or not that person has experience using a fork-lift, cooking Chinese food, or writing computer programs in JavaScript.
Perhaps we can “tag” each job-applicant.
- Some users are tagged “commercial driver's license = yes”.
- Some users are tagged “commercial driver's license = no”.
- Some users do not have a tag for commercial driver's license at all.
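To make the shape of that data concrete, the tags could be pictured in memory as a mapping per applicant that holds only the answers that are known, so a missing key means "unknown". This is only an illustration of the data shape, not a recommendation of any particular database; the names `applicants`, `record_answer`, and `get_tag` and the sample values are hypothetical.

```python
from typing import Optional, Union

TagValue = Union[bool, float]

# Hypothetical sample data: each applicant stores only the tags that are known.
applicants: dict[str, dict[str, TagValue]] = {
    "sarah": {"security clearance": False},
    "ian": {"years of Java programming": 8.0},
}


def record_answer(applicant: str, tag: str, value: TagValue) -> None:
    """Store one answered question as a tag on the applicant."""
    applicants.setdefault(applicant, {})[tag] = value


def get_tag(applicant: str, tag: str) -> Optional[TagValue]:
    """Return the tag value, or None when the tag is absent (unknown)."""
    return applicants.get(applicant, {}).get(tag)


record_answer("jane", "commercial driver's license", True)

print(get_tag("jane", "commercial driver's license"))   # True
print(get_tag("sarah", "commercial driver's license"))  # None: no tag at all
```

Nothing is stored for the thousands of qualifications nobody has asked an applicant about, which is the property a 2,000-column table lacks.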
What kind of database best supports what I am trying to do?
The website we are currently using, stackexchange.com, has a maximum of something like 8 tags per question.
I would like to be able to support at least a dozen, if not a couple hundred, tags for each stage-role (job) or stage-actor (job-applicant).