modeling – Creating models for a subscription-based service using third-party payment gateways

If this question is too broad for Q&A format, I’d appreciate a pointer on where to ask it.

Suppose that you’re using Stripe (or some similar payment gateway) to manage the payments for a service of some sort. Stripe has built-in support for recurring payments.

What I’m struggling to understand is how much of that logic should be mirrored in the application.

For example:

  • Where is the subscription/payment data stored? Is it mirrored in some local DB or just wrapped from the Stripe API?
  • How does one handle the various states of a subscription that can’t be fulfilled (card declined, card missing, card fraudulent)? Suppose that Stripe only handled payments and not subscriptions: how would one invalidate a subscription if a payment fails?

In other words, should User.first.subscription return something that’s stored in the application or essentially just make a call to Stripe every single time? If that’s the case, should my application not even have a Subscription model as it’s essentially just piped in from Stripe?

My gut is telling me that in this case Stripe should handle the entirety of the payment and subscription logic, and the service application itself should only wrap Stripe. Is this fair to say?
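For comparison, the "mirrored" alternative can be sketched as a local record that changes only when the payment provider reports an event, so that reading a user's subscription never calls the provider. This is a minimal sketch under assumptions: the event type strings imitate Stripe-style webhook names and should be checked against the provider's documentation, and `LocalSubscription`, `apply_event`, and the status values are hypothetical names invented for illustration.

```python
from dataclasses import dataclass

# Assumed mapping from provider event types to local states. The event
# names imitate Stripe-style webhook events; verify them against the
# provider's documentation before relying on them.
EVENT_TO_STATUS = {
    "invoice.paid": "active",
    "invoice.payment_failed": "past_due",
    "customer.subscription.deleted": "canceled",
}


@dataclass
class LocalSubscription:
    """Hypothetical local mirror of a provider-side subscription."""
    user_id: int
    provider_subscription_id: str
    status: str = "incomplete"

    def is_usable(self) -> bool:
        # The application decides access from the local copy alone.
        return self.status == "active"


def apply_event(sub: LocalSubscription, event: dict) -> LocalSubscription:
    """Update the local mirror from one webhook-style event dict."""
    new_status = EVENT_TO_STATUS.get(event.get("type"))
    if new_status is not None:  # unknown event types are ignored
        sub.status = new_status
    return sub


sub = LocalSubscription(user_id=1, provider_subscription_id="sub_example")
apply_event(sub, {"type": "invoice.paid"})
apply_event(sub, {"type": "invoice.payment_failed"})
print(sub.status, sub.is_usable())  # prints: past_due False
```

In this arrangement the provider remains the source of truth for billing, while the application keeps only the small amount of state it needs to answer "may this user use the service right now?".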

data modeling – What is the normal form of JSON?

This is going to sound like a trivial question, but I like to think it’s actually a deep one. The simple question is, “What is the normal form of a typical JSON object?” For reference, I put an example below, but consider any typical JSON object you’ve dealt with; the same question applies.

I ask this theoretical question for a practical reason. In practice, we often need to convert JSON objects to some set of tables. Once they are tables, the tables have measurable normal forms based on all the usual rules of normal forms.

But getting to those tables with their normal form takes work. Now, what else “takes work”? Answer: going from lower normal forms to higher normal forms. What doesn’t “take work” is going down the normal forms, or at least it takes just a trivial amount of work. That is, if I have 6NF, I can rather quickly manipulate my way down to any lower normal form. If I have, say, 2NF, and I need to work my way to at least 5NF for some practical reason, I have much work to do.

Well… since it is rather hard to get JSON to any decent normal form, then intuitively it seems it must be in a very low normal form. I’m hoping someone here can quantify that normal form of the JSON. Much appreciated.

But I still haven’t given the most critical rationale. It is not uncommon for non-technical leaders to ask for miracles. I’m not criticizing, we all know it happens. And the miracle is something of the form, “just write some code to automatically make JSON into tables”.

But wait! If my theory is correct, and JSON is basically 0NF or so, then you can’t automate your way out of it. You can’t go from the very low NF of JSON to anything decent, such as 3NF+, in an automated fashion because that “takes work”. That is, it takes smart humans understanding the domain.

Now, I know some trivial JSON can become some trivial tables. I know there are a few tools that handle the simple cases. But I believe a general purpose JSON-to-Table converter is theoretically not possible because JSON is so low on the normalization information (in the rigorous Claude Shannon sense), that you can’t automate it away.

So, what is the normal form of a typical JSON object? And is there some theory I didn’t find that already proves you can’t automate your way out of this?


{
  "data": {
    "cust1": {
      "name": "Jane",
      "age": 33,
      "address": "Main Street",
      "favorites": {
        "colors": ["blue", "green"]
      }
    },
    "cust2": {
      "name": "Joe",
      "age": 44,
      "address": "West Road",
      "favorites": {
        "colors": ["red", "yellow"]
      }
    }
  }
}
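
To make the "takes work" point concrete, here is a minimal sketch of converting the example above into two flat tables by hand. The function name `to_tables` and the choice of tables are assumptions made for illustration; the key observation is that the code hard-codes decisions a person made (the keys under "data" are customer identifiers, and "colors" is a repeating group that needs its own table).

```python
import json

RAW = """
{
  "data": {
    "cust1": {"name": "Jane", "age": 33, "address": "Main Street",
              "favorites": {"colors": ["blue", "green"]}},
    "cust2": {"name": "Joe", "age": 44, "address": "West Road",
              "favorites": {"colors": ["red", "yellow"]}}
  }
}
"""


def to_tables(doc):
    """Split the nested document into two flat row lists.

    Treating the keys under "data" as customer ids, and "colors" as a
    repeating group, is a modeling decision made by hand here.
    """
    customers, favorite_colors = [], []
    for cust_id, cust in doc["data"].items():
        customers.append((cust_id, cust["name"], cust["age"], cust["address"]))
        for color in cust.get("favorites", {}).get("colors", []):
            favorite_colors.append((cust_id, color))
    return customers, favorite_colors


customers, favorite_colors = to_tables(json.loads(RAW))
print(customers)
print(favorite_colors)
```

Nothing in the JSON itself says that "cust1" is a key value rather than a column name, which is the kind of domain knowledge a fully generic converter would have to guess.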

database design – Modeling an either/or relationship

I’ve seen this post (How should I model an “either/or” relationship?) but it’s not exactly what I’m looking for. Both answers suggest creating a subtype instead of a relationship.

Say I have an entity MACHINE, and I want to create a relationship to connect it with another entity OS; call it “installs” or whatever. This OS has 2 subtypes: WINDOWS and MAC (Linux and Unix would also work, but for demo purposes they are not included). Not considering virtual machines or dual-boot setups, I can only choose one of these 2 subtypes of OS. How should I model this?

(image: demo)

Should I

  • Create 1 relationship between MACHINE and OS. Or

  • Create 2 relationships, between MACHINE and WINDOWS and between MACHINE and MAC. Or

  • Create 1 ternary relationship between MACHINE, WINDOWS and MAC.

And should I add additional attributes to the entities or the relationship?
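
As one possible reading of the first option (a single relationship between MACHINE and OS), the sketch below uses Python's built-in `sqlite3` module with a supertype table carrying a type discriminator. The table and column names (`os`, `os_type`, `machine`, `os_id`) are assumptions chosen for illustration, not a recommendation over the other options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE os (
    os_id   INTEGER PRIMARY KEY,
    -- discriminator: each OS row is exactly one of the two subtypes
    os_type TEXT NOT NULL CHECK (os_type IN ('WINDOWS', 'MAC'))
);
CREATE TABLE machine (
    machine_id INTEGER PRIMARY KEY,
    -- single "installs" relationship: exactly one OS per machine
    os_id      INTEGER NOT NULL REFERENCES os(os_id)
);
""")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


conn.execute("INSERT INTO os VALUES (1, 'WINDOWS')")
conn.execute("INSERT INTO machine VALUES (10, 1)")
print(rejected("INSERT INTO os VALUES (2, 'LINUX')"))   # True: not an allowed subtype
print(rejected("INSERT INTO machine VALUES (11, 99)"))  # True: no such OS row
```

Because a machine points at one OS row and that row has one type, the "either WINDOWS or MAC, never both" rule falls out of the structure rather than needing a ternary relationship.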


How much UML modeling should you do?

I used UML modeling a few times about a decade ago and I am getting myself reacquainted with it. I found it clarifies an application’s design, which results in a faster and easier implementation. I found UML then to be straightforward and simple, but it has grown far more complex, covering more types of modeling along with a bigger vocabulary than when I last used it.

Although I am still getting my head around it, it appears the new additions to UML are merely different ways of capturing the same design. They also seem to go from high-level down to low-level modeling, blurring the lines between design and implementation.

Assuming you’re designing a fairly complex application like Adobe Illustrator, for instance, how many of the different modeling types should your design exercise? Are they à la carte, where you pick what makes sense for your design, or is it a full seven-course meal where your design benefits most from using all of it? If it is à la carte, how do you determine which types of modeling cover what you need?

probability – Modeling rolling dice

If three dice are rolled and sum to 7, what is the probability at least one of the dice is a 1?

Here’s what I did:

$$\Omega = \{\{x,y,z\} : x,y,z \in \mathbb{N},\ 1 \le x,y,z \le 6\}$$

Define an event such that $A_n$ is a set that sums to 7

Sets that sum to seven

$A_1 = \{5,1,1\}$, $N(A_1) = {3 \choose 1,2} = 3$

$A_2 = \{2,2,3\}$, $N(A_2) = {3 \choose 1,2} = 3$

$A_3 = \{1,2,4\}$, $N(A_3) = n! = 3! = 6$

$A_4 = \{1,3,3\}$, $N(A_4) = {3 \choose 1,2} = 3$

Define an event such that $B_n$ is all sets that contain at least one die showing numeral 1.

$$P(B \mid A_n) = \frac{N(A \cap B)}{N(A_n)} = \frac{12}{15}$$

Is this correct?
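
The counts above can be checked by brute force. The short sketch below enumerates all ordered rolls of three dice, keeps those that sum to 7, and counts how many of them contain a 1; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (x, y, z) of three dice whose faces sum to 7.
rolls = [r for r in product(range(1, 7), repeat=3) if sum(r) == 7]

# Those outcomes in which at least one die shows a 1.
with_one = [r for r in rolls if 1 in r]

probability = Fraction(len(with_one), len(rolls))
print(len(with_one), len(rolls), probability)  # prints: 12 15 4/5
```

The enumeration agrees with the hand count: 15 ordered outcomes sum to 7, and the only ones without a 1 are the 3 arrangements of $\{2,2,3\}$, leaving 12.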

blender – What is the workflow of modeling 3D rooms?

I’m always struggling with creating flexible, reusable rooms for my underground-set games. When creating reusable rooms I have encountered the following problems:

  • Size – levels can have wildly different sizes and separate objects for all of the sizes seems repetitive and unnecessary
  • Holes in the room – i.e. windows, doors. There may be rooms with exactly the same size but different placement of doors, windows, etc., which only multiplies the number of possible combinations

Possible solutions:

  • Not creating whole rooms but creating separate walls, floors, roofs so I can mix and match in the engine later?
    • Those parts would still require different sizes, unless I am to put 1 object per square meter or similar
  • Creating a flexible Blender model so I can quickly change it and export different variations of the room?

Thanks for any ideas, opinions, experiences

sql – Data modeling – Records that have tags across multiple categories

I have a table that stores the different software services a company offers. The services are tagged by the Industry they serve, the LoB they belong to, and the technology involved in the service.
A service can have multiple tags in each of Industry, LoB and Technology.

For example, the following could be the master data:

(image: master data)

And the transaction data could look like this:

(image: transaction data)

I need to create a view that can query data by Industry, LoB and Technology tags. For the time being I’ve left-outer-joined all the tag-to-service relation tables (service-technology, service-LoB, service-Industry) to the services transaction table, but this produces a huge number of records, as one service can typically be tagged to up to 10-15 industries and technologies.

I just wanted to know the optimal way to model this data so that I can query for a service by all three tags in one view.

I am not a data modeling expert and this is my first venture into the data modeling side, so please pardon the noobness in my question :).
I use SAP HANA as the database and expose data via an OData service, for which I want to provide the database view as a data provider.
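
The row blow-up described above can be reproduced on a tiny scale. The sketch below uses Python's built-in `sqlite3` module rather than SAP HANA, with hypothetical table names (`service_industry`, `service_lob`, `service_technology`) and made-up tag values, and compares the joined ("wide") shape with a stacked ("tall") shape that has one row per service and tag. Whether a tall view suits the OData consumer is an open assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE service_industry   (service_id INTEGER, tag TEXT);
CREATE TABLE service_lob        (service_id INTEGER, tag TEXT);
CREATE TABLE service_technology (service_id INTEGER, tag TEXT);
""")
# One service (id 1) with 3 industry, 2 LoB and 4 technology tags.
conn.executemany("INSERT INTO service_industry VALUES (1, ?)",
                 [("Retail",), ("Banking",), ("Telecom",)])
conn.executemany("INSERT INTO service_lob VALUES (1, ?)",
                 [("Sales",), ("Finance",)])
conn.executemany("INSERT INTO service_technology VALUES (1, ?)",
                 [("Cloud",), ("Mobile",), ("Analytics",), ("IoT",)])

# Joining the three link tables yields every tag combination.
wide = conn.execute("""
    SELECT COUNT(*) FROM service_industry i
    LEFT JOIN service_lob l        ON l.service_id = i.service_id
    LEFT JOIN service_technology t ON t.service_id = i.service_id
""").fetchone()[0]

# Stacking them yields one row per (service, category, tag).
tall = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT service_id, 'Industry' AS category, tag FROM service_industry
        UNION ALL
        SELECT service_id, 'LoB', tag FROM service_lob
        UNION ALL
        SELECT service_id, 'Technology', tag FROM service_technology
    )
""").fetchone()[0]

print(wide, tall)  # prints: 24 9
```

The joined shape grows as the product of the tag counts (3 × 2 × 4), while the stacked shape grows as their sum (3 + 2 + 4), which is why 10-15 tags per category becomes unmanageable in the first form.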

SQL Server & SSDT : Error modeling database during Publish after adding CREATE AGGREGATE / EXTERNAL NAME

I am using SSDT to create a database containing CLR user defined aggregates.

The aggregates are intended to sit in different schemas. That is not possible with CLR objects inside SSDT without using a post-deployment script to move the objects after deployment.

What I am attempting to do is:

  • Create the CLR aggregates in a single schema (the “CLR” schema) using the “Default schema” option
  • Use a CREATE AGGREGATE wrapper in the desired schema to call the CLR aggregate

For example, I have a CLR aggregate like:

[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = -1)]
public struct StatsEntropy : IBinarySerialize
 ... etc

This gets compiled and becomes CLR.StatsEntropy in the database. The database is called AnalysisFunctions, so it has an assembly also called AnalysisFunctions. This is no problem.

I then create a wrapper in the Stats schema like so:

CREATE AGGREGATE [Stats].[Entropy]
    @Values FLOAT
EXTERNAL NAME AnalysisFunctions.StatsEntropy

This works without error when I build via debug. I have my debug connection set to a named instance of SQL Server. If I press F5, it compiles and deploys without any problem. The code is now usable on the server, including the wrapper.

But, if I attempt to publish to a server using Publish, it does not work. Instead it fails with:

Errors occurred while modeling the target database. Deployment cannot continue.

There are no further error messages either in the Data Tools Operations panel or error list. ssdttrace shows no useful information:

(image: ssdttrace event log)

This only happens when the database already exists on the server. If I drop the database, it can deploy successfully. I can deploy successfully via debug even if the database already exists; it is only via Publish that the problem occurs. I did not start getting this problem until I added the aggregate wrappers.

Previously I was also getting errors indicating that the assembly AnalysisFunctions had an unresolved reference due to the use of EXTERNAL NAME – again working via debug, not via publish. But now that error has mysteriously vanished.

I am using Visual Studio 2017 15.9.21 and SSDT 15.1.62002.01090, deploying to SQL Server 2019.

Does anyone have any idea what this error might be, or how I can debug it?

modeling – Tool for coordinating/managing independent analysis tools?

Not sure this is the right place to ask, but here goes.

I have a task that will involve multiple modeling & simulation tools and data sets developed across my company. You can think of it as Tool_A feeds Tool_B feeds Tool_C, etc., iterating to achieve some result. It’s more complicated than that, but you get the idea.

These tools vary in their maturity, with some pieces yet to be written, and some going back decades. Languages are primarily C/C++ and FORTRAN. Most of the tools read & write their data to files; some are SQL database oriented. Performance is such that all of this can probably initially run on a single large machine, but we’ll eventually need to distribute the work across multiple machines. Wholesale re-architecting of these tools isn’t on the table, but it’s possible to open them up and make some changes, if required.

To put it diplomatically, most of the team is oriented more toward physical sciences, math and engineering than software.

So, what I’m looking for is some system/set of tools/whatever you call it to manage & coordinate this enterprise. Something to basically run the overall process – start the individual tools, monitor the progress of each step, manage the selection, translation & transfer of data between tools, assess decision points, etc. All while logging & providing feedback on progress to the user.

We’ve hacked stuff like this together in the past with incrontab & scripting, but it’s always been very fragile and hard to maintain. It seems there has to be a better way, but I don’t know how to look for it. I don’t even know what to put in the tags field!
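
To illustrate the coordinating role described above (start each tool, watch its exit status, log progress, stop at a failed step), here is a minimal single-machine sketch using only the Python standard library. The function name `run_pipeline` and the stand-in commands are assumptions for illustration; real steps would launch Tool_A, Tool_B, and so on, and this sketch does not cover data translation, decision points, or distribution across machines.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("pipeline")


def run_pipeline(steps):
    """Run (name, argv) steps in order; stop at the first failure.

    Returns the list of step names that completed successfully.
    """
    completed = []
    for name, argv in steps:
        log.info("starting %s", name)
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            log.error("%s failed (exit %d): %s",
                      name, result.returncode, result.stderr.strip())
            break
        log.info("finished %s", name)
        completed.append(name)
    return completed


# Stand-in commands; real steps would invoke Tool_A, Tool_B, ... instead.
steps = [
    ("Tool_A", [sys.executable, "-c", "print('a')"]),
    ("Tool_B", [sys.executable, "-c", "import sys; sys.exit(3)"]),
    ("Tool_C", [sys.executable, "-c", "print('c')"]),
]
print(run_pipeline(steps))  # prints: ['Tool_A']
```

Keeping the step list as data, separate from the runner, is what distinguishes this from ad hoc scripts: the same description of the chain could later be handed to a dedicated workflow manager without touching the tools themselves.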